PREFACE


To Volume 1 


This work represents our effort to present the basic concepts of vector and tensor analysis. Volume 
I begins with a brief discussion of algebraic structures followed by a rather detailed discussion of 
the algebra of vectors and tensors. Volume II begins with a discussion of Euclidean Manifolds 
which leads to a development of the analytical and geometrical aspects of vector and tensor fields. 
We have not included a discussion of general differentiable manifolds. However, we have included 
a chapter on vector and tensor fields defined on Hypersurfaces in a Euclidean Manifold. 


In preparing this two-volume work our intention is to present to Engineering and Science
students a modern introduction to vectors and tensors. Traditional courses on applied mathematics 
have emphasized problem solving techniques rather than the systematic development of concepts. 
As a result, it is possible for such courses to become terminal mathematics courses rather than 
courses which equip the student to develop his or her understanding further. 


As Engineering students our courses on vectors and tensors were taught in the traditional 
way. We learned to identify vectors and tensors by formal transformation rules rather than by their 
common mathematical structure. The subject seemed to consist of nothing but a collection of 
mathematical manipulations of long equations decorated by a multitude of subscripts and 
superscripts. Prior to our applying vector and tensor analysis to our research area of modern 
continuum mechanics, we almost had to relearn the subject. Therefore, one of our objectives in 
writing this book is to make available a modern introductory textbook suitable for the first in-depth 
exposure to vectors and tensors. Because of our interest in applications, it is our hope that this 
book will aid students in their efforts to use vectors and tensors in applied areas. 


The presentation of the basic mathematical concepts is, we hope, as clear and brief as 
possible without being overly abstract. Since we have written an introductory text, no attempt has 
been made to include every possible topic. The topics we have included tend to reflect our 
personal bias. We make no claim that there are not other introductory topics which could have 
been included. 


Basically the text was designed in order that each volume could be used in a one-semester 
course. We feel Volume I is suitable for an introductory linear algebra course of one semester. 
Given this course, or an equivalent, Volume II is suitable for a one-semester course on vector and
tensor analysis. Many exercises are included in each volume. However, it is likely that teachers
will wish to generate additional exercises. Several times during the preparation of this book we
taught a one-semester course to students with a very limited background in linear algebra and no
background in tensor analysis. Typically these students were majoring in Engineering or one of the
Physical Sciences. However, we occasionally had students from the Social Sciences. For this
one-semester course, we covered the material in Chapters 0, 3, 4, 5, 7 and 8 from Volume I and
selected topics from Chapters 9, 10, and 11 from Volume II. As to level, our classes have contained
juniors, seniors and graduate students. These students seemed to experience no unusual difficulty
with the material.


It is a pleasure to acknowledge our indebtedness to our students for their help and 
forbearance. Also, we wish to thank the U. S. National Science Foundation for its support during 
the preparation of this work. We especially wish to express our appreciation for the patience and 
understanding of our wives and children during the extended period this work was in preparation. 


Houston, Texas R.M.B. 
C.-C.W.


Chapter 0


ELEMENTARY MATRIX THEORY 


When we introduce the various types of structures essential to the study of vectors and 
tensors, it is convenient in many cases to illustrate these structures by examples involving matrices. 
It is for this reason we are including a very brief introduction to matrix theory here. We shall not 
make any effort toward rigor in this chapter. In Chapter V we shall return to the subject of matrices 
and augment, in a more careful fashion, the material presented here. 


An M by N matrix A is a rectangular array of real or complex numbers $A_{ij}$ arranged in
M rows and N columns. A matrix is often written

$$
A = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2N} \\
\vdots & \vdots &        & \vdots \\
A_{M1} & A_{M2} & \cdots & A_{MN}
\end{bmatrix} \tag{0.1}
$$

and the numbers $A_{ij}$ are called the elements or components of A. The matrix A is called a real
matrix or a complex matrix according to whether the components of A are real numbers or
complex numbers.


A matrix of M rows and N columns is said to be of order M by N or M × N. It is
customary to enclose the array with brackets, parentheses or double straight lines. We shall adopt
the notation in (0.1). The location of the indices is sometimes modified to the forms $A^{ij}$,
$A^{i}{}_{j}$, or $A_{i}{}^{j}$. Throughout this chapter the placement of the indices is unimportant
and shall always be written as in (0.1). The elements $A_{i1}, A_{i2}, \ldots, A_{iN}$ are the elements
of the $i$th row of A, and the elements $A_{1k}, A_{2k}, \ldots, A_{Mk}$ are the elements of the $k$th
column. The convention is that the first index denotes the row and the second the column.


A row matrix is a 1 × N matrix, e.g.,

$$\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1N} \end{bmatrix}$$

while a column matrix is an M × 1 matrix, e.g.,

$$\begin{bmatrix} A_{11} \\ A_{21} \\ \vdots \\ A_{M1} \end{bmatrix}$$

The matrix A is often written simply

$$A = [A_{ij}] \tag{0.2}$$


A square matrix is an N × N matrix. In a square matrix A, the elements $A_{11}, A_{22}, \ldots, A_{NN}$
are its diagonal elements. The sum of the diagonal elements of a square matrix A is called the trace
and is written tr A. Two matrices A and B are said to be equal if they are identical. That is, A and
B have the same number of rows and the same number of columns and

$$A_{ij} = B_{ij}, \quad i = 1, \ldots, N, \quad j = 1, \ldots, M$$


A matrix, every element of which is zero, is called the zero matrix and is written simply 0. 


If $A = [A_{ij}]$ and $B = [B_{ij}]$ are two M × N matrices, their sum (difference) is an M × N
matrix A + B (A − B) whose elements are $A_{ij} + B_{ij}$ ($A_{ij} - B_{ij}$). Thus

$$A \pm B = [A_{ij} \pm B_{ij}] \tag{0.3}$$

Two matrices of the same order are said to be conformable for addition and subtraction. Addition
and subtraction are not defined for matrices which are not conformable. If $\lambda$ is a number and A
is a matrix, then $\lambda A$ is a matrix given by

$$\lambda A = [\lambda A_{ij}] = A\lambda \tag{0.4}$$

Therefore,

$$-A = (-1)A = [-A_{ij}] \tag{0.5}$$

These definitions of addition, subtraction, and multiplication by a number imply that


$$A + B = B + A \tag{0.6}$$

$$A + (B + C) = (A + B) + C \tag{0.7}$$

$$A + 0 = A \tag{0.8}$$

$$A - A = 0 \tag{0.9}$$

$$\lambda(A + B) = \lambda A + \lambda B \tag{0.10}$$

$$(\lambda + \mu)A = \lambda A + \mu A \tag{0.11}$$

and

$$1A = A \tag{0.12}$$

where A, B and C are assumed to be conformable.


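If A is an M × N matrix and B is an N × K matrix, then the product of B by A is written
AB and is an M × K matrix with elements $\sum_{j=1}^{N} A_{ij}B_{js}$, $i = 1, \ldots, M$,
$s = 1, \ldots, K$. For example, if

$$
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{31} & A_{32} \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
$$

then AB is a 3 × 2 matrix given by

$$
AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{31} & A_{32} \end{bmatrix}
\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
= \begin{bmatrix}
A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\
A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \\
A_{31}B_{11} + A_{32}B_{21} & A_{31}B_{12} + A_{32}B_{22}
\end{bmatrix}
$$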


The product AB is defined only when the number of columns of A is equal to the number 
of rows of B. If this is the case, A is said to be conformable to B for multiplication. If A is 
conformable to B, then B is not necessarily conformable to A. Even if BA is defined, it is not 
necessarily equal to AB. On the assumption that A, B, and C are conformable for the indicated
sums and products, it is possible to show that

$$A(B + C) = AB + AC \tag{0.13}$$

$$(A + B)C = AC + BC \tag{0.14}$$

and

$$A(BC) = (AB)C \tag{0.15}$$

However, $AB \neq BA$ in general, AB = 0 does not imply A = 0 or B = 0, and AB = AC does not
necessarily imply B = C.
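
The cautions above can be checked numerically. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper name `matmul` and the sample matrices are illustrative choices, not part of the text.

```python
def matmul(A, B):
    """Product AB of an M x N matrix A and an N x K matrix B (lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("A is not conformable to B for multiplication")
    # Element (i, s) of AB is the sum over j of A[i][j] * B[j][s].
    return [[sum(A[i][j] * B[j][s] for j in range(len(B)))
             for s in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)   # multiplying on the right by B swaps the columns of A
BA = matmul(B, A)   # multiplying on the left by B swaps the rows of A

# Two nonzero matrices whose product is the zero matrix.
P = [[1, 0], [0, 0]]
Q = [[0, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
PQ = matmul(P, Q)
```

Here `AB` and `BA` differ, and `PQ` is the zero matrix although neither factor is zero. Since `matmul(P, Q)` equals `matmul(P, ZERO)` while `Q` differs from `ZERO`, the same pair also illustrates that AB = AC does not force B = C.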


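The square matrix I defined by

$$
I = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix} \tag{0.16}
$$

is the identity matrix. The identity matrix is a special case of a diagonal matrix which has all of its
elements zero except the diagonal ones. A square matrix A whose elements satisfy $A_{ij} = 0$, $i > j$,
is called an upper triangular matrix, i.e.,

$$
A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & \cdots & A_{1N} \\
0 & A_{22} & A_{23} & \cdots & A_{2N} \\
0 & 0 & A_{33} & \cdots & A_{3N} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & A_{NN}
\end{bmatrix}
$$

A lower triangular matrix can be defined in a similar fashion. A diagonal matrix is both an upper
triangular matrix and a lower triangular matrix.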


If A and B are square matrices of the same order such that AB = BA = I, then B is called
the inverse of A and we write $B = A^{-1}$. Also, A is the inverse of B, i.e. $A = B^{-1}$. If A has an
inverse it is said to be nonsingular. If A and B are square matrices of the same order with inverses
$A^{-1}$ and $B^{-1}$ respectively, then

$$(AB)^{-1} = B^{-1}A^{-1} \tag{0.17}$$

Equation (0.17) follows because

$$(AB)B^{-1}A^{-1} = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$$

and similarly

$$(B^{-1}A^{-1})AB = I$$


The matrix of order N × M obtained by interchanging the rows and columns of an M × N
matrix A is called the transpose of A and is denoted by $A^T$. It is easily shown that

$$(A^T)^T = A \tag{0.18}$$

$$(\lambda A)^T = \lambda A^T \tag{0.19}$$

$$(A + B)^T = A^T + B^T \tag{0.20}$$

and

$$(AB)^T = B^T A^T \tag{0.21}$$

A square matrix A is symmetric if $A = A^T$ and skew-symmetric if $A = -A^T$. Therefore, for a
symmetric matrix

$$A_{ij} = A_{ji} \tag{0.22}$$

and for a skew-symmetric matrix

$$A_{ij} = -A_{ji} \tag{0.23}$$

Equation (0.23) implies that the diagonal elements of a skew-symmetric matrix are all zero. Every
square matrix A can be written uniquely as the sum of a symmetric matrix and a skew-symmetric
matrix, namely

$$A = \tfrac{1}{2}(A + A^T) + \tfrac{1}{2}(A - A^T) \tag{0.24}$$
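
As an illustration of the decomposition (0.24), the sketch below splits a small square matrix into its symmetric and skew-symmetric parts. It assumes matrices stored as lists of rows and uses exact fractions for the factor 1/2; the names `transpose` and `sym_skew_parts` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction

def transpose(A):
    """Interchange the rows and columns of A."""
    return [list(row) for row in zip(*A)]

def sym_skew_parts(A):
    """Return (S, W) with S = (A + A^T)/2 and W = (A - A^T)/2, as in (0.24)."""
    At = transpose(A)
    n = len(A)
    half = Fraction(1, 2)
    S = [[half * (A[i][j] + At[i][j]) for j in range(n)] for i in range(n)]
    W = [[half * (A[i][j] - At[i][j]) for j in range(n)] for i in range(n)]
    return S, W

A = [[1, 2], [4, 3]]
S, W = sym_skew_parts(A)
# S is symmetric, W is skew-symmetric with zero diagonal, and S + W recovers A.
recombined = [[S[i][j] + W[i][j] for j in range(2)] for i in range(2)]
```

For this choice of A the symmetric part has off-diagonal entries 3 and the skew-symmetric part has entries ∓1 off the diagonal, consistent with (0.22) and (0.23).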


If A is a square matrix, its determinant is written det A. The reader is assumed at this
stage to have some familiarity with the computational properties of determinants. In particular, it
should be known that

$$\det A = \det A^T \quad \text{and} \quad \det AB = (\det A)(\det B) \tag{0.25}$$

If A is an N × N square matrix, we can construct an (N − 1) × (N − 1) square matrix by removing
the $i$th row and the $j$th column. The determinant of this matrix is denoted by $M_{ij}$ and is the
minor of $A_{ij}$. The cofactor of $A_{ij}$ is defined by

$$\operatorname{cof} A_{ij} = (-1)^{i+j} M_{ij} \tag{0.26}$$

For example, if

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \tag{0.27}$$

then

$$M_{11} = A_{22}, \quad M_{12} = A_{21}, \quad M_{21} = A_{12}, \quad M_{22} = A_{11}$$

$$\operatorname{cof} A_{11} = A_{22}, \quad \operatorname{cof} A_{12} = -A_{21}, \quad \text{etc.}$$


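The adjoint of an N × N matrix A, written adj A, is an N × N matrix given by

$$
\operatorname{adj} A = \begin{bmatrix}
\operatorname{cof} A_{11} & \operatorname{cof} A_{21} & \cdots & \operatorname{cof} A_{N1} \\
\operatorname{cof} A_{12} & \operatorname{cof} A_{22} & \cdots & \operatorname{cof} A_{N2} \\
\vdots & \vdots & & \vdots \\
\operatorname{cof} A_{1N} & \operatorname{cof} A_{2N} & \cdots & \operatorname{cof} A_{NN}
\end{bmatrix} \tag{0.28}
$$

The reader is cautioned that the designation “adjoint” is used in a different context in Section 18.
However, no confusion should arise since the adjoint as defined in Section 18 will be designated by
a different symbol. It is possible to show that

$$A(\operatorname{adj} A) = (\det A)I = (\operatorname{adj} A)A \tag{0.29}$$

We shall prove (0.29) in general later in Section 21; so we shall be content to establish it for N = 2
here. For N = 2

$$\operatorname{adj} A = \begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix}$$

Then

$$
A(\operatorname{adj} A) =
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix}
= \begin{bmatrix} A_{11}A_{22} - A_{12}A_{21} & 0 \\ 0 & A_{11}A_{22} - A_{12}A_{21} \end{bmatrix}
$$

Therefore

$$A(\operatorname{adj} A) = (A_{11}A_{22} - A_{12}A_{21})I = (\det A)I$$

Likewise

$$(\operatorname{adj} A)A = (\det A)I$$

If $\det A \neq 0$, then (0.29) shows that the inverse $A^{-1}$ exists and is given by

$$A^{-1} = \frac{\operatorname{adj} A}{\det A} \tag{0.30}$$

Of course, if $A^{-1}$ exists, then $(\det A)(\det A^{-1}) = \det I = 1 \neq 0$. Thus nonsingular matrices
are those with a nonzero determinant.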
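
The identity (0.29) can be spot-checked for a matrix larger than 2 × 2. The sketch below is one possible implementation under stated assumptions: matrices are lists of rows with N ≥ 2, the determinant is computed by cofactor expansion along the first row (adequate for small N only), and the names `minor_matrix`, `det`, `cof`, `adj` and `matmul` are chosen here for illustration.

```python
def minor_matrix(A, i, j):
    """Matrix obtained from A by removing row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j))
               for j in range(len(A)))

def cof(A, i, j):
    """Cofactor (0.26); the sign (-1)^(i+j) is the same for 0- and 1-based indices."""
    return (-1) ** (i + j) * det(minor_matrix(A, i, j))

def adj(A):
    """Adjoint (0.28): element (i, j) is the cofactor of A_ji."""
    n = len(A)
    return [[cof(A, j, i) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
d = det(A)
left = matmul(A, adj(A))    # should equal d times the identity
right = matmul(adj(A), A)   # likewise
```

Dividing each element of `adj(A)` by `d` then gives the inverse (0.30) whenever `d` is nonzero.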


Matrix notation is convenient for manipulations of systems of linear algebraic equations. 
For example, if we have the set of equations 


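$$
\begin{aligned}
A_{11}x_1 + A_{12}x_2 + A_{13}x_3 + \cdots + A_{1N}x_N &= y_1 \\
A_{21}x_1 + A_{22}x_2 + A_{23}x_3 + \cdots + A_{2N}x_N &= y_2 \\
&\;\;\vdots \\
A_{N1}x_1 + A_{N2}x_2 + A_{N3}x_3 + \cdots + A_{NN}x_N &= y_N
\end{aligned}
$$

then they can be written

$$
\begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2N} \\
\vdots & \vdots & & \vdots \\
A_{N1} & A_{N2} & \cdots & A_{NN}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}
$$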


The above matrix equation can now be written in the compact notation 


AX =Y (0.31) 




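and if A is nonsingular the solution is

$$X = A^{-1}Y \tag{0.32}$$

For N = 2, we can immediately write

$$
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \frac{1}{\det A}
\begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \tag{0.33}
$$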
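
Formula (0.33) translates directly into a few lines of code. The following sketch assumes integer coefficients so that exact fractions can be used; the function name `solve_2x2` and the sample system are illustrative and do not come from the text.

```python
from fractions import Fraction

def solve_2x2(A, y):
    """Solve AX = Y for a 2 x 2 integer matrix A using (0.33)."""
    (a11, a12), (a21, a22) = A
    d = a11 * a22 - a12 * a21          # det A
    if d == 0:
        raise ValueError("A is singular; (0.33) does not apply")
    y1, y2 = y
    # Rows of adj A applied to Y, then divided by det A.
    x1 = Fraction(a22 * y1 - a12 * y2, d)
    x2 = Fraction(-a21 * y1 + a11 * y2, d)
    return x1, x2

# Sample system: 2*x1 + x2 = 3 and x1 + 3*x2 = 5.
A = [[2, 1], [1, 3]]
x1, x2 = solve_2x2(A, (3, 5))
```

Substituting the returned values back into the two equations is a quick way to confirm the result.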


Exercises 


0.1 Add the matrices 


0.2 Add 


0.3 Multiply 


$$
\begin{bmatrix} 2i & 3 & 7+2i \\ 5 & 4+3i & i \end{bmatrix}
\begin{bmatrix} 2i & 8 \\ 1 & 6i \\ 3i & 2 \end{bmatrix}
$$


0.4 Show that the product of two upper (lower) triangular matrices is an upper (lower) triangular
matrix. Further, if

$$A = [A_{ij}], \quad B = [B_{ij}]$$

are upper (lower) triangular matrices of order N × N, then

$$(AB)_{ii} = (BA)_{ii} = A_{ii}B_{ii}$$

for all $i = 1, \ldots, N$. The off-diagonal elements $(AB)_{ij}$ and $(BA)_{ij}$, $i \neq j$, generally are
not equal, however.


0.5 What is the transpose of 


$$\begin{bmatrix} 2i & 3 & 7+2i \\ 5 & 4+3i & i \end{bmatrix}$$


0.6 What is the inverse of 


0.7 Solve the equations 


by use of (0.33). 


Chapter 1 


SETS, RELATIONS, AND FUNCTIONS 


The purpose of this chapter is to introduce an initial vocabulary and some basic concepts. 
Most of the readers are expected to be somewhat familiar with this material; so its treatment is 
brief. 


Section 1. Sets and Set Algebra 


The concept of a set is regarded here as primitive. It is a collection or family of things
viewed as a simple entity. The things in the set are called elements or members of the set. They
are said to be contained in or to belong to the set. Sets will generally be denoted by upper case
script letters, $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}, \ldots$, and elements by lower case
letters $a, b, c, d, \ldots$. The sets of complex numbers, real numbers and integers will be denoted by
$\mathcal{C}$, $\mathcal{R}$, and $\mathcal{I}$, respectively. The notation $a \in \mathcal{A}$ means that
the element a is contained in the set $\mathcal{A}$; if a is not an element of $\mathcal{A}$, the
notation $a \notin \mathcal{A}$ is employed. To denote the set whose elements are a, b, c, and d, the
notation $\{a, b, c, d\}$ is employed. In mathematics a set is not generally a collection of unrelated
objects like a tree, a doorknob and a concept, but rather it is a collection whose members share some
common property, like the vineyards of France, which share the common property of being in France, or
the real numbers, which share the common property of being real. A set whose elements are determined
by their possession of a certain property is denoted by $\{x \mid P(x)\}$, where x denotes a typical
element and $P(x)$ is the property which determines x to be in the set.


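If $\mathcal{A}$ and $\mathcal{B}$ are sets, $\mathcal{B}$ is said to be a subset of $\mathcal{A}$ if
every element of $\mathcal{B}$ is also an element of $\mathcal{A}$. It is customary to indicate that
$\mathcal{B}$ is a subset of $\mathcal{A}$ by the notation $\mathcal{B} \subset \mathcal{A}$, which may
be read as “$\mathcal{B}$ is contained in $\mathcal{A}$,” or $\mathcal{A} \supset \mathcal{B}$, which may
be read as “$\mathcal{A}$ contains $\mathcal{B}$.” For example, the set of integers $\mathcal{I}$ is a
subset of the set of real numbers $\mathcal{R}$, $\mathcal{I} \subset \mathcal{R}$. Equality of two sets
$\mathcal{A}$ and $\mathcal{B}$ is said to exist if $\mathcal{A}$ is a subset of $\mathcal{B}$ and
$\mathcal{B}$ is a subset of $\mathcal{A}$; in equation form

$$\mathcal{A} = \mathcal{B} \Leftrightarrow \mathcal{A} \subset \mathcal{B} \ \text{and} \ \mathcal{B} \subset \mathcal{A} \tag{1.1}$$

A nonempty subset $\mathcal{B}$ of $\mathcal{A}$ is called a proper subset of $\mathcal{A}$ if
$\mathcal{B}$ is not equal to $\mathcal{A}$. The set of integers $\mathcal{I}$ is actually a proper
subset of the real numbers. The empty set or null set is the set with no elements and is denoted by
$\varnothing$. The singleton is the set containing a single element a and is denoted by $\{a\}$. A set
whose elements are sets is often called a class.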
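Some operations on sets which yield other sets will now be introduced. The union of the
sets $\mathcal{A}$ and $\mathcal{B}$ is the set of all elements that are either in the set $\mathcal{A}$
or in the set $\mathcal{B}$. The union of $\mathcal{A}$ and $\mathcal{B}$ is denoted by
$\mathcal{A} \cup \mathcal{B}$

$$\mathcal{A} \cup \mathcal{B} = \{a \mid a \in \mathcal{A} \ \text{or} \ a \in \mathcal{B}\} \tag{1.2}$$

It is easy to show that the operation of forming a union is commutative,

$$\mathcal{A} \cup \mathcal{B} = \mathcal{B} \cup \mathcal{A} \tag{1.3}$$

and associative

$$\mathcal{A} \cup (\mathcal{B} \cup \mathcal{C}) = (\mathcal{A} \cup \mathcal{B}) \cup \mathcal{C} \tag{1.4}$$

The intersection of the sets $\mathcal{A}$ and $\mathcal{B}$ is the set of elements that are in both
$\mathcal{A}$ and $\mathcal{B}$. The intersection is denoted by $\mathcal{A} \cap \mathcal{B}$ and is
specified by

$$\mathcal{A} \cap \mathcal{B} = \{a \mid a \in \mathcal{A} \ \text{and} \ a \in \mathcal{B}\} \tag{1.5}$$

Again, it can be shown that the operation of intersection is commutative

$$\mathcal{A} \cap \mathcal{B} = \mathcal{B} \cap \mathcal{A} \tag{1.6}$$

and associative

$$\mathcal{A} \cap (\mathcal{B} \cap \mathcal{C}) = (\mathcal{A} \cap \mathcal{B}) \cap \mathcal{C} \tag{1.7}$$

Two sets are said to be disjoint if they have no elements in common, i.e. if

$$\mathcal{A} \cap \mathcal{B} = \varnothing \tag{1.8}$$

The operations of union and intersection are related by the following distributive laws:

$$
\begin{aligned}
\mathcal{A} \cap (\mathcal{B} \cup \mathcal{C}) &= (\mathcal{A} \cap \mathcal{B}) \cup (\mathcal{A} \cap \mathcal{C}) \\
\mathcal{A} \cup (\mathcal{B} \cap \mathcal{C}) &= (\mathcal{A} \cup \mathcal{B}) \cap (\mathcal{A} \cup \mathcal{C})
\end{aligned} \tag{1.9}
$$

The complement of the set $\mathcal{B}$ with respect to the set $\mathcal{A}$ is the set of all elements
contained in $\mathcal{A}$ but not contained in $\mathcal{B}$. The complement of $\mathcal{B}$ with
respect to the set $\mathcal{A}$ is denoted by $\mathcal{A}/\mathcal{B}$ and is specified by

$$\mathcal{A}/\mathcal{B} = \{a \mid a \in \mathcal{A} \ \text{and} \ a \notin \mathcal{B}\} \tag{1.10}$$

It is easy to show that

$$\mathcal{A}/(\mathcal{A}/\mathcal{B}) = \mathcal{B} \quad \text{for} \ \mathcal{B} \subset \mathcal{A} \tag{1.11}$$

and

$$\mathcal{A}/\mathcal{B} = \mathcal{A} \Leftrightarrow \mathcal{A} \cap \mathcal{B} = \varnothing \tag{1.12}$$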
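
For finite sets, the operations of this section correspond to built-in set operators in Python, which allows quick spot checks of the laws stated above. The sketch below uses small sample sets chosen only for illustration; Python's `-` operator plays the role of the complement written with a slash in the text.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6}

union = A | B            # elements in A or in B
intersection = A & B     # elements in both A and B
complement = A - B       # elements of A not contained in B

# Distributive laws (1.9), checked on the sample sets.
distributive_1 = A & (B | C) == (A & B) | (A & C)
distributive_2 = A | (B & C) == (A | B) & (A | C)

# (1.11): for a subset S of A, the complement of the complement is S again.
S = {1, 2}
double_complement = A - (A - S)

# (1.12): the complement of D in A is all of A exactly when A and D are disjoint.
D = {7, 8}
disjoint_case = (A - D == A) and (A & D == set())
```

A check on a few sample sets is of course not a proof; it only confirms that the laws behave as stated on these instances.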


Exercises 


1.1 List all possible subsets of the set $\{a, b, c, d\}$.


1.2 List two proper subsets of the set {a,b}. 


1.3 Prove the following formulas: 
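(a) $\mathcal{A} = \mathcal{A} \cup \mathcal{B} \Leftrightarrow \mathcal{B} \subset \mathcal{A}$.
(b) $\mathcal{A} = \mathcal{A} \cup \varnothing$.
(c) $\mathcal{A} = \mathcal{A} \cup \mathcal{A}$.
1.4 Show that $\mathcal{I} \cap \mathcal{R} = \mathcal{I}$.
1.5 Verify the distributive laws (1.9).
1.6 Verify the following formulas:
(a) $\mathcal{A} \subset \mathcal{B} \Leftrightarrow \mathcal{A} \cap \mathcal{B} = \mathcal{A}$.
(b) $\mathcal{A} \cap \varnothing = \varnothing$.
(c) $\mathcal{A} \cap \mathcal{A} = \mathcal{A}$.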
1.7 Give a proof of the commutative and associative properties of the intersection operation. 
1.8 Let $\mathcal{A} = \{-1,-2,-3,-4\}$, $\mathcal{B} = \{-1,0,1,2,3,7\}$, $\mathcal{C} = \{0\}$, and $\mathcal{D} = \{-7,-5,-3,-1,1,2,3\}$. List


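the elements of $\mathcal{A} \cup \mathcal{B}$, $\mathcal{A} \cap \mathcal{B}$, $\mathcal{A} \cup \mathcal{C}$, $\mathcal{B} \cap \mathcal{C}$, $\mathcal{A} \cup \mathcal{B} \cup \mathcal{C}$, $\mathcal{A}/\mathcal{B}$, and $(\mathcal{D}/\mathcal{C}) \cup \mathcal{A}$.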
Section 2. Ordered Pairs, Cartesian Products, and Relations


The idea of ordering is not involved in the definition of a set. For example, the set {a,b} is 
equal to the set {b,a}. In many cases of interest it is important to order the elements of a set. To 
define an ordered pair (a,b) we single out a particular element of {a,b}, say a , and define (a,b) 
to be the class of sets; in equation form, 


(a,b) ={{a},{a,b}} (2.1) 
This ordered pair is then different from the ordered pair (b,a) which is defined by 

(b,a) = {{b}, {b,a}} (2.2) 
It follows directly from the definition that 


$$(a,b) = (c,d) \Leftrightarrow a = c \ \text{and} \ b = d$$

Clearly, the definition of an ordered pair can be extended to an ordered N-tuple $(a_1, a_2, \ldots, a_N)$.


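The Cartesian Product of two sets $\mathcal{A}$ and $\mathcal{B}$ is a set whose elements are ordered
pairs $(a,b)$ where a is an element of $\mathcal{A}$ and b is an element of $\mathcal{B}$. We denote the
Cartesian product of $\mathcal{A}$ and $\mathcal{B}$ by $\mathcal{A} \times \mathcal{B}$.

$$\mathcal{A} \times \mathcal{B} = \{(a,b) \mid a \in \mathcal{A}, b \in \mathcal{B}\} \tag{2.3}$$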


It is easy to see how a Cartesian product of N sets can be formed using the notion of an ordered N- 
tuple. 
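
The set-theoretic definition (2.1) of an ordered pair and the Cartesian product (2.3) can both be modeled for finite sets. The following is a sketch only: `kpair` and `cartesian` are names invented for this example, `frozenset` stands in for a set whose elements are sets, and Python tuples stand in for ordered pairs in the product.

```python
from itertools import product

def kpair(a, b):
    """Ordered pair as the class of sets {{a}, {a, b}}, following (2.1)."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def cartesian(A, B):
    """Cartesian product (2.3), with tuples playing the role of ordered pairs."""
    return {(a, b) for a, b in product(A, B)}

unordered_equal = {1, 2} == {2, 1}              # order plays no role in a set
ordered_differ = kpair(1, 2) != kpair(2, 1)     # but it does in an ordered pair
pairs = cartesian({1, 2}, {"x", "y", "z"})
```

With this encoding two pairs are equal exactly when their first entries agree and their second entries agree, which is the property stated after (2.2).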


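Any subset $\mathcal{K}$ of $\mathcal{A} \times \mathcal{B}$ defines a relation from the set $\mathcal{A}$
to the set $\mathcal{B}$. The notation $a \mathcal{K} b$ is employed if $(a,b) \in \mathcal{K}$; this
notation is modified for negation as $a \not\mathcal{K}\, b$ if $(a,b) \notin \mathcal{K}$. As an example
of a relation let $\mathcal{A}$ be the set of all Volkswagens in the United States and let $\mathcal{B}$
be the set of all Volkswagens in Germany. Let $\mathcal{K} \subset \mathcal{A} \times \mathcal{B}$ be
defined so that $a \mathcal{K} b$ if b is the same color as a.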


If $\mathcal{A} = \mathcal{B}$, then $\mathcal{K}$ is a relation on $\mathcal{A}$. Such a relation is said
to be reflexive if $a \mathcal{K} a$ for all $a \in \mathcal{A}$, it is said to be symmetric if
$a \mathcal{K} b$ whenever $b \mathcal{K} a$ for any a and b in $\mathcal{A}$, and it is said to be
transitive if $a \mathcal{K} b$ and $b \mathcal{K} c$ imply that $a \mathcal{K} c$ for any a, b and c in
$\mathcal{A}$. A relation on a set $\mathcal{A}$ is said to be an equivalence relation if it is reflexive,
symmetric and transitive. As an example of an equivalence relation $\mathcal{K}$ on the set of all real
numbers $\mathcal{R}$ let $a \mathcal{K} b \Leftrightarrow a = b$ for all a, b. To verify that this is an
equivalence relation, note that $a = a$ for each $a \in \mathcal{R}$ (reflexivity),
$a = b \Rightarrow b = a$ for each $a, b \in \mathcal{R}$ (symmetry), and
$a = b, b = c \Rightarrow a = c$ for a, b and c in $\mathcal{R}$ (transitivity).

A relation $\mathcal{K}$ on $\mathcal{A}$ is antisymmetric if $a \mathcal{K} b$ and $b \mathcal{K} a$
together imply $b = a$. A relation on a set $\mathcal{A}$ is said to be a partial ordering if it is
reflexive, antisymmetric and transitive. The equality relation is a trivial example of partial ordering.
As another example of a partial ordering on $\mathcal{R}$, let the inequality $\leq$ be the relation. To
verify that this is a partial ordering note that $a \leq a$ for every $a \in \mathcal{R}$ (reflexivity),
$a \leq b$ and $b \leq a \Rightarrow a = b$ for all $a, b \in \mathcal{R}$ (antisymmetry), and $a \leq b$
and $b \leq c \Rightarrow a \leq c$ for all $a, b, c \in \mathcal{R}$ (transitivity). Of course inequality
is not an equivalence relation since it is not symmetric.
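
On a finite set a relation is just a finite set of pairs, so the defining properties can be tested by enumeration. The sketch below assumes relations stored as Python sets of tuples on the sample set {0, 1, 2, 3}; the predicate names are illustrative.

```python
def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

A = range(4)
EQ = {(a, b) for a in A for b in A if a == b}   # equality
LE = {(a, b) for a in A for b in A if a <= b}   # less than or equal
LT = {(a, b) for a in A for b in A if a < b}    # strict inequality

eq_is_equivalence = is_reflexive(EQ, A) and is_symmetric(EQ) and is_transitive(EQ)
le_is_partial_order = (is_reflexive(LE, A) and is_antisymmetric(LE)
                       and is_transitive(LE))
```

On this sample, equality passes all three tests for an equivalence relation, while `LE` is a partial ordering that fails the symmetry test. The relation `LT` is included so that its properties can be examined in the same way.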


Exercises 
2.1 Define an ordered triple (a,b,c) by 
(a,b,c) = ((a,b),c) 
and show that $(a,b,c) = (d,e,f) \Leftrightarrow a = d$, $b = e$, and $c = f$.


2.2 Let $\mathcal{A}$ be a set and $\mathcal{K}$ an equivalence relation on $\mathcal{A}$. For each
$a \in \mathcal{A}$ consider the set of all elements x that stand in the relation $\mathcal{K}$ to a;
this set is denoted by $\{x \mid a \mathcal{K} x\}$, and is called an equivalence set.


(a) Show that the equivalence set of a contains a.
(b) Show that any two equivalence sets either coincide or are disjoint.


Note: (a) and (b) show that the equivalence sets form a partition of $\mathcal{A}$; $\mathcal{A}$ is
the disjoint union of the equivalence sets.


2.3 On the set of all real numbers does the strict inequality < constitute a partial ordering on
the set?


2.4 For ordered triples of real numbers define $(a_1,b_1,c_1) \leq (a_2,b_2,c_2)$ if $a_1 \leq a_2$,
$b_1 \leq b_2$, $c_1 \leq c_2$. Does this relation define a partial ordering on
$\mathcal{R} \times \mathcal{R} \times \mathcal{R}$?
Section 3. Functions


A relation f from $\mathcal{X}$ to $\mathcal{Y}$ is said to be a function (or a mapping) if
$(x, y_1) \in f$ and $(x, y_2) \in f$ imply $y_1 = y_2$ for all $x \in \mathcal{X}$. A function is thus
seen to be a particular kind of relation which has the property that for each $x \in \mathcal{X}$ there
exists one and only one $y \in \mathcal{Y}$ such that $(x,y) \in f$. Thus a function f defines a
single-valued relation. Since a function is just a particular relation, the notation of the last section
could be used, but this is rarely done. A standard notation for a function from $\mathcal{X}$ to
$\mathcal{Y}$ is

$$f : \mathcal{X} \to \mathcal{Y}$$

or

$$y = f(x)$$

which indicates the element $y \in \mathcal{Y}$ that f associates with $x \in \mathcal{X}$. The element
y is called the value of f at x. The domain of the function f is the set $\mathcal{X}$. The range of f
is the set of all y for which there exists an x such that $y = f(x)$. We denote the range of f by
$f(\mathcal{X})$. When the domain of f is the set of real numbers $\mathcal{R}$, f is said to be a
function of a real variable. When $f(\mathcal{X})$ is contained in $\mathcal{R}$, the function is said to
be real-valued. Functions of a complex variable and complex-valued functions are defined analogously. A
function of a real variable need not be real-valued, nor need a function of a complex variable be
complex-valued.


If $\mathcal{X}_0$ is a subset of $\mathcal{X}$, $\mathcal{X}_0 \subset \mathcal{X}$, and
$f : \mathcal{X} \to \mathcal{Y}$, the image of $\mathcal{X}_0$ under f is the set

$$f(\mathcal{X}_0) = \{ f(x) \mid x \in \mathcal{X}_0 \} \tag{3.1}$$

and it is easy to prove that

$$f(\mathcal{X}_0) \subset f(\mathcal{X}) \tag{3.2}$$

Similarly, if $\mathcal{Y}_0$ is a subset of $\mathcal{Y}$, then the preimage of $\mathcal{Y}_0$ under f
is the set

$$f^{-1}(\mathcal{Y}_0) = \{ x \mid f(x) \in \mathcal{Y}_0 \} \tag{3.3}$$

and it is easy to prove that

$$f^{-1}(\mathcal{Y}_0) \subset \mathcal{X} \tag{3.4}$$

A function f is said to be into $\mathcal{Y}$ if $f(\mathcal{X})$ is a proper subset of $\mathcal{Y}$,
and it is said to be onto $\mathcal{Y}$ if $f(\mathcal{X}) = \mathcal{Y}$. A function that is onto is also
called surjective. Stated in a slightly different way, a function is onto if for every element
$y \in \mathcal{Y}$ there exists at least one element $x \in \mathcal{X}$ such that $y = f(x)$. A function
f is said to be one-to-one (or injective) if for every $x_1, x_2 \in \mathcal{X}$

$$f(x_1) = f(x_2) \Rightarrow x_1 = x_2 \tag{3.5}$$

In other words, a function is one-to-one if for every element $y \in f(\mathcal{X})$ there exists only one
element $x \in \mathcal{X}$ such that $y = f(x)$. A function f is said to form a one-to-one correspondence
(or to be bijective) from $\mathcal{X}$ to $\mathcal{Y}$ if $f(\mathcal{X}) = \mathcal{Y}$, and f is
one-to-one. Naturally, a function can be onto without being one-to-one or be one-to-one without being
onto. The following examples illustrate these terminologies.


1. Let $f(x) = x^5$ be a particular real valued function of a real variable. This function is
one-to-one because $x^5 = z^5$ implies $x = z$ when $x, z \in \mathcal{R}$ and it is onto since for
every $y \in \mathcal{R}$ there is some x such that $y = x^5$.

2. Let $f(x) = x^3$ be a particular complex-valued function of a complex variable. This
function is onto but it is not one-to-one because, for example,

$$f(1) = f\left(-\tfrac{1}{2} + \tfrac{\sqrt{3}}{2}\,i\right) = f\left(-\tfrac{1}{2} - \tfrac{\sqrt{3}}{2}\,i\right) = 1$$

where $i = \sqrt{-1}$.

3. Let $f(x) = 2|x|$ be a particular real valued function of a real variable where $|x|$ denotes
the absolute value of x. This function is into rather than onto and it is not one-to-one.
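
The notions of image, preimage, onto and one-to-one can be tried out on functions with a finite domain. In the sketch below a function is stored as a Python dictionary mapping each x to f(x); the helper names and the small sample domain are assumptions made for illustration, and the two functions are finite stand-ins for odd-power and absolute-value behavior rather than the real-variable functions of the examples.

```python
def image(f, subset):
    """Image of a subset of the domain, as in (3.1)."""
    return {f[x] for x in subset}

def preimage(f, subset):
    """Preimage of a subset of the codomain, as in (3.3)."""
    return {x for x in f if f[x] in subset}

def is_onto(f, codomain):
    return set(f.values()) == set(codomain)

def is_one_to_one(f):
    # Distinct arguments must give distinct values.
    return len(set(f.values())) == len(f)

X = [-2, -1, 0, 1, 2]
cube = {x: x ** 3 for x in X}              # one-to-one on X
double_abs = {x: 2 * abs(x) for x in X}    # not one-to-one: f(-1) = f(1)

Y = {-8, -1, 0, 1, 8}
cube_is_bijection = is_one_to_one(cube) and is_onto(cube, Y)
```

Note that the preimage is defined for `double_abs` even though that function has no inverse: `preimage(double_abs, {2})` simply returns both arguments that are mapped to 2.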


If $f : \mathcal{X} \to \mathcal{Y}$ is a one-to-one function, then there exists a function
$f^{-1} : f(\mathcal{X}) \to \mathcal{X}$ which associates with each $y \in f(\mathcal{X})$ the unique
$x \in \mathcal{X}$ such that $y = f(x)$. We call $f^{-1}$ the inverse of f, written $x = f^{-1}(y)$.
Unlike the preimage defined by (3.3), the definition of inverse functions is possible only for functions
that are one-to-one. Clearly, we have $f^{-1}(f(\mathcal{X})) = \mathcal{X}$. The composition of two
functions $f : \mathcal{X} \to \mathcal{Y}$ and $g : \mathcal{Y} \to \mathcal{Z}$ is a function
$h : \mathcal{X} \to \mathcal{Z}$ defined by $h(x) = g(f(x))$ for all $x \in \mathcal{X}$. The function h
is written $h = g \circ f$. The composition of any finite number of functions is easily defined in a
fashion similar to that of a pair. The operation of composition of functions is not generally
commutative,

$$g \circ f \neq f \circ g$$

Indeed, if $g \circ f$ is defined, $f \circ g$ may not be defined, and even if $g \circ f$ and
$f \circ g$ are both defined, they may not be equal. The operation of composition is associative

$$h \circ (g \circ f) = (h \circ g) \circ f$$

if each of the indicated compositions is defined. The identity function
$\mathrm{id} : \mathcal{X} \to \mathcal{X}$ is defined as the function $\mathrm{id}(x) = x$ for all
$x \in \mathcal{X}$. Clearly if f is a one-to-one correspondence from $\mathcal{X}$ to $\mathcal{Y}$,
then

$$f^{-1} \circ f = \mathrm{id}_{\mathcal{X}}, \quad f \circ f^{-1} = \mathrm{id}_{\mathcal{Y}}$$

where $\mathrm{id}_{\mathcal{X}}$ and $\mathrm{id}_{\mathcal{Y}}$ denote the identity functions of
$\mathcal{X}$ and $\mathcal{Y}$, respectively.
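
A short sketch may help fix the order convention in a composition and the role of the inverse. The names `compose`, `add_one`, `double` and `inverse` are illustrative; ordinary Python functions model maps on numbers, and a dictionary models a one-to-one correspondence between two finite sets.

```python
def compose(g, f):
    """Return g o f, the function x -> g(f(x))."""
    return lambda x: g(f(x))

def add_one(x):
    return x + 1

def double(x):
    return 2 * x

g_after_f = compose(double, add_one)   # x -> 2(x + 1)
f_after_g = compose(add_one, double)   # x -> 2x + 1

def inverse(f):
    """Inverse of a one-to-one function stored as a dict."""
    inv = {y: x for x, y in f.items()}
    if len(inv) != len(f):
        raise ValueError("f is not one-to-one, so it has no inverse")
    return inv

f = {1: "a", 2: "b", 3: "c"}
f_inv = inverse(f)
```

Evaluating both compositions at the same argument shows that the order matters, and composing `f` with `f_inv` in either order returns each element to itself, in line with the identity relations stated above.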


To close this discussion of functions we consider the special types of functions called
sequences. A finite sequence of N terms is a function f whose domain is the set of the first
N positive integers $\{1, 2, 3, \ldots, N\}$. The range of f is the set of N elements
$\{f(1), f(2), \ldots, f(N)\}$, which is usually denoted by $\{f_1, f_2, \ldots, f_N\}$. The elements
$f_1, f_2, \ldots, f_N$ of the range are called terms of the sequence. Similarly, an infinite sequence is
a function defined on the positive integers $\mathcal{I}^+$. The range of an infinite sequence is usually
denoted by $\{f_n\}$, which stands for the infinite set $\{f_1, f_2, f_3, \ldots\}$. A function g whose
domain is $\mathcal{I}^+$ and whose range is contained in $\mathcal{I}^+$ is said to be order preserving
if $m < n$ implies that $g(m) < g(n)$ for all $m, n \in \mathcal{I}^+$. If f is an infinite sequence and
g is order preserving, then the composition $f \circ g$ is a subsequence of f. For example, let

$$f_n = 1/(n+1), \quad g_n = 4^n$$

Then

$$f \circ g(n) = 1/(4^n + 1)$$

is a subsequence of f.
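
The subsequence example can be evaluated term by term. The sketch below uses exact fractions and checks the order-preserving property of g only on a finite range, which is an illustration rather than a proof; the function names mirror the symbols in the text.

```python
from fractions import Fraction

def f(n):
    """Term f_n = 1/(n + 1) of the sequence."""
    return Fraction(1, n + 1)

def g(n):
    """Index map g_n = 4^n."""
    return 4 ** n

def subsequence(n):
    """Term n of the subsequence f o g."""
    return f(g(n))

first_terms = [subsequence(n) for n in (1, 2, 3)]
order_preserving_on_sample = all(g(m) < g(m + 1) for m in range(1, 10))
```

The first three terms are 1/5, 1/17 and 1/65, which are the terms of f with indices 4, 16 and 64.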


Exercises 


3.1 Verify the results (3.2) and (3.4). 

3.2 How may the domain and range of the sine function be chosen so that it is a one-to-one 
correspondence? 

3.3 Which of the following conditions defines a function y = f(x)? 


x + y = 1, xy = 1, $x, y \in \mathcal{R}$.


3.4 Give an example of a real valued function of a complex variable. 
3.5 Can the implication of (3.5) be reversed? Why? 
3.6 Under what conditions is the operation of composition of functions commutative? 
3.7 Show that the arc sine function $\sin^{-1}$ really is not a function from $\mathcal{R}$ to
$\mathcal{R}$. How can it be made into a function?


Chapter 2 


GROUPS, RINGS, AND FIELDS 


The mathematical concept of a group has been crystallized and abstracted from numerous 
situations both within the body of mathematics and without. Permutations of objects in a set is a 
group operation. The operations of multiplication and addition in the real number system are group 
operations. The study of groups is developed from the study of particular situations in which 
groups appear naturally, and it is instructive that the group itself be presented as an object for 
study. In the first three sections of this chapter certain of the basic properties of groups are 
introduced. In the last section the group concept is used in the definition of rings and fields. 


Section 4. The Axioms for a Group 


Central to the definition of a group is the idea of a binary operation. If $\mathcal{G}$ is a non-empty
set, a binary operation on $\mathcal{G}$ is a function from $\mathcal{G} \times \mathcal{G}$ to
$\mathcal{G}$. If $a, b \in \mathcal{G}$, the binary operation will be denoted by $*$ and its value by
$a * b$. The important point to be understood about a binary operation on $\mathcal{G}$ is that
$\mathcal{G}$ is closed with respect to $*$ in the sense that if $a, b \in \mathcal{G}$ then
$a * b \in \mathcal{G}$ also. A binary operation $*$ on $\mathcal{G}$ is associative if

$$(a * b) * c = a * (b * c) \quad \text{for all} \ a, b, c \in \mathcal{G} \tag{4.1}$$

Thus, parentheses are unnecessary in the combination of more than two elements by an associative
operation. A semigroup is a pair $(\mathcal{G}, *)$ consisting of a nonempty set $\mathcal{G}$ with an
associative binary operation $*$. A binary operation $*$ on $\mathcal{G}$ is commutative if

$$a * b = b * a \quad \text{for all} \ a, b \in \mathcal{G} \tag{4.2}$$

The multiplication of two real numbers and the addition of two real numbers are both examples of
commutative binary operations. The multiplication of two N by N matrices (N > 1) is a
noncommutative binary operation. A binary operation is often denoted by $a \circ b$, $a \cdot b$, or
$ab$ rather than by $a * b$; if the operation is commutative, the notation $a + b$ is sometimes used in
place of $a * b$.


An element $e \in \mathcal{G}$ that satisfies the condition

$$e * a = a * e = a \quad \text{for all} \ a \in \mathcal{G} \tag{4.3}$$

is called an identity element for the binary operation $*$ on the set $\mathcal{G}$. In the set of real
numbers with the binary operation of multiplication, it is easy to see that the number 1 plays the role
of identity element. In the set of real numbers with the binary operation of addition, 0 has the role of
the identity element. Clearly $\mathcal{G}$ contains at most one identity element e. For if $e'$ is
another identity element in $\mathcal{G}$, then $e' * a = a * e' = a$ for all $a \in \mathcal{G}$ also.
In particular, if we choose $a = e$, then $e' * e = e$. But from (4.3), we have also $e' * e = e'$. Thus
$e' = e$. In general, $\mathcal{G}$ need not have any identity element. But if there is an identity
element, and if the binary operation is regarded as multiplicative, then the identity element is often
called the unity element; on the other hand, if the binary operation is additive, then the identity
element is called the zero element.


In a semigroup $\mathcal{G}$ containing an identity element e with respect to the binary operation $*$,
an element $a^{-1}$ is said to be an inverse of the element a if

$$a * a^{-1} = a^{-1} * a = e \tag{4.4}$$

In general, a need not have an inverse. But if an inverse $a^{-1}$ of a exists, then it is unique, the
proof being essentially the same as that of the uniqueness of the identity element. The identity
element is its own inverse. In the set $\mathcal{R}/\{0\}$ with the binary operation of multiplication,
the inverse of a number is the reciprocal of the number. In the set of real numbers with the binary
operation of addition the inverse of a number is the negative of the number.


A group is a pair $(\mathcal{G}, *)$ consisting of an associative binary operation $*$ and a set
$\mathcal{G}$ which contains the identity element and the inverses of all elements of $\mathcal{G}$ with
respect to the binary operation $*$. This definition can be explicitly stated in equation form as
follows.


Definition. A group is a pair $(\mathcal{G}, *)$ where $\mathcal{G}$ is a set and $*$ is a binary
operation satisfying the following:

(a) $(a * b) * c = a * (b * c)$ for all $a, b, c \in \mathcal{G}$.

(b) There exists an element $e \in \mathcal{G}$ such that $a * e = e * a = a$ for all $a \in \mathcal{G}$.

(c) For every $a \in \mathcal{G}$ there exists an element $a^{-1} \in \mathcal{G}$ such that
$a * a^{-1} = a^{-1} * a = e$.


If the binary operation of the group is commutative, the group is said to be a commutative (or 
Abelian) group. 
The set $\mathcal{R}/\{0\}$ with the binary operation of multiplication forms a group, and the set
$\mathcal{R}$ with the binary operation of addition forms another group. The set of positive integers
with the binary operation of multiplication forms a semigroup with an identity element but does not form
a group because the condition (c) above is not satisfied.


A notational convention customarily employed is to denote a group simply by $\mathcal{G}$ rather than
by the pair $(\mathcal{G}, *)$. This convention assumes that the particular $*$ to be employed is
understood. We shall follow this convention here.
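
For a finite set the conditions (a), (b) and (c) of the definition, together with closure, can be verified by exhaustive enumeration. The sketch below is one way to do this; the function name `is_group` and the modular-arithmetic examples are choices made here for illustration and do not appear in the text.

```python
from itertools import product

def is_group(elements, op):
    """Check closure and conditions (a), (b), (c) by brute force on a finite set."""
    G = list(elements)
    # Closure: op must be a binary operation on G.
    if not all(op(a, b) in G for a, b in product(G, G)):
        return False
    # (a) Associativity.
    if not all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(G, G, G)):
        return False
    # (b) Existence of an identity element.
    identities = [e for e in G if all(op(e, a) == a and op(a, e) == a for a in G)]
    if not identities:
        return False
    e = identities[0]
    # (c) Every element has an inverse.
    return all(any(op(a, b) == e and op(b, a) == e for b in G) for a in G)

add_mod_4 = lambda a, b: (a + b) % 4
mul_mod_4 = lambda a, b: (a * b) % 4
mul_mod_5 = lambda a, b: (a * b) % 5

additive_group = is_group(range(4), add_mod_4)          # integers mod 4 under addition
nonzero_mod_5 = is_group(range(1, 5), mul_mod_5)        # nonzero residues mod 5
semigroup_only = is_group(range(4), mul_mod_4)          # 0 and 2 have no inverse
```

The last example is a finite analogue of the positive integers under multiplication: it is an associative operation with identity 1, yet it fails condition (c).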


Exercises 


4.1 Verify that the set $\mathcal{G} = \{1, i, -1, -i\}$, where $i^2 = -1$, is a group with respect to the binary
operation of multiplication. 


4.2 Verify that the set $\mathcal{G}$ consisting of the four 2 × 2 matrices


o1 


constitutes a group with respect to the binary operation of matrix multiplication. 


4.3 Verify that the set $\mathcal{G}$ consisting of the four 2 × 2 matrices


out [oak bat Lo 4 


constitutes a group with respect to the binary operation of matrix multiplication. 


4.4 Determine a subset of the set of all 3 × 3 matrices with the binary operation of matrix
multiplication that will form a group. 


4.5 If $\mathcal{A}$ is a set, show that the set $\mathcal{G}$,

$\mathcal{G} = \{ f \mid f : \mathcal{A} \to \mathcal{A},\ f \text{ is a one-to-one correspondence} \}$

constitutes a group with respect to the binary operation of composition.


Section 5. Properties of a Group


In this section certain basic properties of groups are established. There is a great economy 
of effort in establishing these properties for a group in general because the properties will then be 
possessed by any system that can be shown to constitute a group, and there are many such systems 
of interest in this text. For example, any property established in this section will automatically hold for the group consisting of $\mathcal{R}/\{0\}$ with the binary operation of multiplication, for the group consisting of the real numbers with the binary operation of addition, and for groups involving vectors and tensors that will be introduced in later chapters.


The basic properties of a group $\mathcal{G}$ in general are:

1. The identity element $e \in \mathcal{G}$ is unique.

2. The inverse element $a^{-1}$ of any element $a \in \mathcal{G}$ is unique.

The proof of Property 1 was given in the preceding section. The proof of Property 2 follows by essentially the same argument.

If $n$ is a positive integer, the powers of $a \in \mathcal{G}$ are defined as follows: (i) For $n = 1$, $a^1 = a$. (ii) For $n > 1$, $a^n = a^{n-1} * a$. (iii) $a^0 = e$. (iv) $a^{-n} = (a^{-1})^n$.


3. If $m, n, k$ are any integers, positive or negative, then for $a \in \mathcal{G}$

$a^m * a^n = a^{m+n}, \quad (a^m)^n = a^{mn}, \quad (a^{m+n})^k = a^{mk+nk}$

In particular, when $m = n = -1$, we have:

4. $(a^{-1})^{-1} = a$ for all $a \in \mathcal{G}$.

5. $(a*b)^{-1} = b^{-1} * a^{-1}$ for all $a, b \in \mathcal{G}$.


For square matrices the proof of this property is given by (0.17); the proof in general is exactly the 
same. 


The following property is a useful algebraic rule for all groups; it gives the solutions $x$ and $y$ to the equations $x*a = b$ and $a*y = b$, respectively.

6. For any elements $a, b$ in $\mathcal{G}$, the two equations $x*a = b$ and $a*y = b$ have the unique solutions $x = b*a^{-1}$ and $y = a^{-1}*b$.

Proof. Since the proof is the same for both equations, we will consider only the equation $x*a = b$. Clearly $x = b*a^{-1}$ satisfies the equation $x*a = b$,

$x*a = (b*a^{-1})*a = b*(a^{-1}*a) = b*e = b$

Conversely, $x*a = b$ implies

$x = x*e = x*(a*a^{-1}) = (x*a)*a^{-1} = b*a^{-1}$

which is unique. As a result we have the next two properties.


7. For any three elements $a, b, c$ in $\mathcal{G}$, either $a*c = b*c$ or $c*a = c*b$ implies that $a = b$.

8. For any two elements $a, b$ in the group $\mathcal{G}$, either $a*b = b$ or $b*a = b$ implies that $a$ is the identity element.
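Property 6 can be tried out in a small non-commutative group. The sketch below assumes the group of permutations of three objects under composition, with a permutation stored as a tuple of images; this representation and the helper names `compose` and `inverse` are our own illustrative choices.

```python
def compose(p, q):
    """Return the permutation p*q, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

a = (1, 0, 2)   # interchanges objects 0 and 1
b = (0, 2, 1)   # interchanges objects 1 and 2

x = compose(b, inverse(a))   # x = b * a^{-1}, the solution of x * a = b
y = compose(inverse(a), b)   # y = a^{-1} * b, the solution of a * y = b

print(compose(x, a) == b)    # True
print(compose(a, y) == b)    # True
print(x == y)                # False: this group is not Abelian
```

The last line illustrates why the two equations must be treated separately unless the group is commutative.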


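A non-empty subset $\mathcal{G}'$ of $\mathcal{G}$ is a subgroup of $\mathcal{G}$ if $\mathcal{G}'$ is a group with respect to the binary operation of $\mathcal{G}$, i.e., $\mathcal{G}'$ is a subgroup of $\mathcal{G}$ if and only if (i) $e \in \mathcal{G}'$, (ii) $a \in \mathcal{G}' \Rightarrow a^{-1} \in \mathcal{G}'$, (iii) $a, b \in \mathcal{G}' \Rightarrow a*b \in \mathcal{G}'$.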


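9. Let $\mathcal{G}'$ be a nonempty subset of $\mathcal{G}$. Then $\mathcal{G}'$ is a subgroup if $a, b \in \mathcal{G}' \Rightarrow a*b^{-1} \in \mathcal{G}'$.

Proof. The proof consists in showing that the conditions (i), (ii), (iii) of a subgroup are satisfied. (i) Since $\mathcal{G}'$ is non-empty, it contains an element $a$, hence $a*a^{-1} = e \in \mathcal{G}'$. (ii) If $b \in \mathcal{G}'$ then $e*b^{-1} = b^{-1} \in \mathcal{G}'$. (iii) If $a, b \in \mathcal{G}'$, then $a*(b^{-1})^{-1} = a*b \in \mathcal{G}'$.

If $\mathcal{G}$ is a group, then $\mathcal{G}$ itself is a subgroup of $\mathcal{G}$, and the group consisting only of the element $e$ is also a subgroup of $\mathcal{G}$. A subgroup of $\mathcal{G}$ other than $\mathcal{G}$ itself and the group $\{e\}$ is called a proper subgroup of $\mathcal{G}$.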
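The one-step test of Property 9 is convenient to automate. As a sketch, assume the additive group of integers modulo $n$ (an example not used in the text), in which $a*b^{-1}$ becomes $(a - b) \bmod n$; the function name `is_subgroup` is hypothetical.

```python
def is_subgroup(subset, n):
    """Property 9 in the additive group of integers mod n: a nonempty subset
    is a subgroup if (a - b) mod n stays in the subset for all a, b in it."""
    subset = set(subset)
    return bool(subset) and all((a - b) % n in subset
                                for a in subset for b in subset)

print(is_subgroup({0, 3, 6, 9}, 12))   # True
print(is_subgroup({0, 3, 5}, 12))      # False, since (3 - 5) mod 12 = 10

# The intersection of two subgroups passes the same test.
evens = {0, 2, 4, 6, 8, 10}
threes = {0, 3, 6, 9}
print(is_subgroup(evens & threes, 12)) # True
```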


10. The intersection of any two subgroups of a group remains a subgroup. 


Exercises 


5.1 Show that e is its own inverse. 


5.2 Prove Property 3.

5.3 Prove Properties 7 and 8.

5.4 Show that $x = y$ in Property 6 if $\mathcal{G}$ is Abelian.

5.5 If $\mathcal{G}$ is a group and $a \in \mathcal{G}$, show that $a*a = a$ implies $a = e$.

5.6 Prove Property 10.

5.7 Show that if we look at the non-zero real numbers under multiplication, then (a) the non-zero rational numbers form a subgroup, (b) the positive real numbers form a subgroup, (c) the irrational numbers do not form a subgroup.


Section 6. Group Homomorphisms


A “group homomorphism” is a fairly overpowering phrase for those who have not heard it 
or seen it before. The word homomorphism means simply the same type of formation or structure. 
A group homomorphism has then to do with groups having the same type of formation or structure. 


Specifically, if $\mathcal{G}$ and $\mathcal{H}$ are two groups with the binary operations $*$ and $\circ$, respectively, a function $f : \mathcal{G} \to \mathcal{H}$ is a homomorphism if

$f(a*b) = f(a) \circ f(b)$ for all $a, b \in \mathcal{G}$    (6.1)


If a homomorphism exists between two groups, the groups are said to be homomorphic. A homomorphism $f : \mathcal{G} \to \mathcal{H}$ is an isomorphism if $f$ is both one-to-one and onto. If an isomorphism exists between two groups, the groups are said to be isomorphic. Finally, a homomorphism $f : \mathcal{G} \to \mathcal{G}$ is called an endomorphism and an isomorphism $f : \mathcal{G} \to \mathcal{G}$ is called an automorphism. To illustrate these definitions, consider the following examples:


1. Let $\mathcal{G}$ be the group of all nonsingular, real, $N \times N$ matrices with the binary operation of matrix multiplication. Let $\mathcal{H}$ be the group $\mathcal{R}/\{0\}$ with the binary operation of scalar multiplication. The function that is the determinant of a matrix is then a homomorphism from $\mathcal{G}$ to $\mathcal{H}$.

2. Let $\mathcal{G}$ be any group and let $\mathcal{H}$ be the group whose only element is $e$. Let $f$ be the function that assigns to every element of $\mathcal{G}$ the value $e$; then $f$ is a (trivial) homomorphism.

3. The identity function $\mathrm{id} : \mathcal{G} \to \mathcal{G}$ is a (trivial) automorphism of any group.

4. Let $\mathcal{G}$ be the group of positive real numbers with the binary operation of multiplication and let $\mathcal{H}$ be the group of real numbers with the binary operation of addition. The log function is an isomorphism between $\mathcal{G}$ and $\mathcal{H}$.

5. Let $\mathcal{G}$ be the group $\mathcal{R}/\{0\}$ with the binary operation of multiplication. The function that takes the absolute value of a number is then an endomorphism of $\mathcal{G}$ into $\mathcal{G}$. The restriction of this function to the subgroup $\mathcal{H}$ of $\mathcal{G}$ consisting of all positive real numbers is a (trivial) automorphism of $\mathcal{H}$, however.
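The defining relation (6.1) can be spot-checked numerically for Examples 1, 4, and 5. The sketch below restricts Example 1 to $2 \times 2$ matrices and uses arbitrarily chosen sample values; the helper names `det2` and `matmul2` are ours, and a numerical check of this kind illustrates the relation without proving it.

```python
import math

def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2.0, 1.0], [1.0, 3.0]]    # det A = 5
B = [[0.0, -1.0], [4.0, 5.0]]   # det B = 4

# Example 1: det(A B) = det(A) det(B).
print(math.isclose(det2(matmul2(A, B)), det2(A) * det2(B)))        # True

# Example 4: log(x y) = log(x) + log(y) for positive x, y.
x, y = 2.5, 7.0
print(math.isclose(math.log(x * y), math.log(x) + math.log(y)))    # True

# Example 5: |x y| = |x| |y| for nonzero x, y.
print(abs(-3.0 * 2.0) == abs(-3.0) * abs(2.0))                     # True
```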


From these examples of homomorphisms and automorphisms one might note that a homomorphism maps identities into identities and inverses into inverses. The proof of this observation is the content of Theorem 6.1 below. Theorem 6.2 shows that a homomorphism takes a subgroup into a subgroup and Theorem 6.3 proves a converse result.

Theorem 6.1. If $f : \mathcal{G} \to \mathcal{H}$ is a homomorphism, then $f(e)$ coincides with the identity element $e_0$ of $\mathcal{H}$ and

$f(a^{-1}) = f(a)^{-1}$


Proof. Using the definition (6.1) of a homomorphism on the identity $a = a*e$, one finds that

$f(a) = f(a*e) = f(a) \circ f(e)$

From Property 8 of a group it follows that $f(e)$ is the identity element $e_0$ of $\mathcal{H}$. Now, using this result and the definition of a homomorphism (6.1) applied to the equation $e = a*a^{-1}$, we find

$e_0 = f(e) = f(a*a^{-1}) = f(a) \circ f(a^{-1})$

It follows from Property 2 that $f(a^{-1})$ is the inverse of $f(a)$.


Theorem 6.2. If $f : \mathcal{G} \to \mathcal{H}$ is a homomorphism and if $\mathcal{G}'$ is a subgroup of $\mathcal{G}$, then $f(\mathcal{G}')$ is a subgroup of $\mathcal{H}$.

Proof. The proof will consist in showing that the set $f(\mathcal{G}')$ satisfies the conditions (i), (ii), (iii) of a subgroup. (i) Since $e \in \mathcal{G}'$ it follows from Theorem 6.1 that the identity element $e_0$ of $\mathcal{H}$ is contained in $f(\mathcal{G}')$. (ii) For any $a \in \mathcal{G}'$, $f(a)^{-1} \in f(\mathcal{G}')$ by Theorem 6.1. (iii) For any $a, b \in \mathcal{G}'$, $f(a) \circ f(b) \in f(\mathcal{G}')$ since $f(a) \circ f(b) = f(a*b) \in f(\mathcal{G}')$.

As a corollary to Theorem 6.2 we see that $f(\mathcal{G})$ is itself a subgroup of $\mathcal{H}$.


Theorem 6.3. If $f : \mathcal{G} \to \mathcal{H}$ is a homomorphism and if $\mathcal{H}'$ is a subgroup of $\mathcal{H}$, then the preimage $f^{-1}(\mathcal{H}')$ is a subgroup of $\mathcal{G}$.

The kernel of a homomorphism $f : \mathcal{G} \to \mathcal{H}$ is the subgroup $f^{-1}(e_0)$ of $\mathcal{G}$. In other words, the kernel of $f$ is the set of elements of $\mathcal{G}$ that are mapped by $f$ to the identity element $e_0$ of $\mathcal{H}$. The notation $K(f)$ will be used to denote the kernel of $f$. In Example 1 above, the kernel of $f$ consists of $N \times N$ matrices with determinant equal to the real number 1, while in Example 2 it is the entire group $\mathcal{G}$. In Example 3 the kernel of the identity map is the identity element, of course.


Theorem 6.4. A homomorphism $f : \mathcal{G} \to \mathcal{H}$ is one-to-one if and only if $K(f) = \{e\}$.

Proof. This proof consists in showing that $f(a) = f(b)$ implies $a = b$ if and only if $K(f) = \{e\}$. If $f(a) = f(b)$, then $f(a) \circ f(b)^{-1} = e_0$, hence $f(a*b^{-1}) = e_0$, and it follows that $a*b^{-1} \in K(f)$. Thus, if $K(f) = \{e\}$, then $a*b^{-1} = e$ or $a = b$. Conversely, now, we assume $f$ is one-to-one and since $K(f)$ is a subgroup of $\mathcal{G}$ by Theorem 6.3, it must contain $e$. If $K(f)$ contains any other element $a$ such that $f(a) = e_0$, we would have a contradiction since $f$ is one-to-one; therefore $K(f) = \{e\}$.
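Theorem 6.4 can be observed on a small finite group. The sketch below assumes the additive group of integers modulo 12 and the maps $x \mapsto kx \bmod 12$, which are homomorphisms of that group into itself; the group and the names `kernel`, `is_one_to_one`, and `times` are illustrative and do not come from the text.

```python
def kernel(f, elements, identity):
    """Elements mapped by f to the identity of the target group."""
    return {a for a in elements if f(a) == identity}

def is_one_to_one(f, elements):
    """True when no two elements share an image under f."""
    images = [f(a) for a in elements]
    return len(set(images)) == len(images)

def times(k):
    """The endomorphism x -> k*x of the integers mod 12 under addition."""
    return lambda x: (k * x) % 12

Z12 = range(12)
for k in (4, 5):
    f = times(k)
    print(k, sorted(kernel(f, Z12, 0)), is_one_to_one(f, Z12))
# 4 [0, 3, 6, 9] False
# 5 [0] True
```

In both cases the map is one-to-one exactly when the kernel reduces to the identity element 0.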


Since an isomorphism is one-to-one and onto, it has an inverse which is a function from $\mathcal{H}$ onto $\mathcal{G}$. The next theorem shows that the inverse is also an isomorphism.

Theorem 6.5. If $f : \mathcal{G} \to \mathcal{H}$ is an isomorphism, then $f^{-1} : \mathcal{H} \to \mathcal{G}$ is an isomorphism.

Proof. Since $f$ is one-to-one and onto, it follows that $f^{-1}$ is also one-to-one and onto (cf. Section 3). Let $a_0$ be the element of $\mathcal{H}$ such that $a = f^{-1}(a_0)$ for any $a \in \mathcal{G}$; then $a*b = f^{-1}(a_0) * f^{-1}(b_0)$. But $a*b$ is the inverse image of the element $a_0 \circ b_0 \in \mathcal{H}$ because $f(a*b) = f(a) \circ f(b) = a_0 \circ b_0$ since $f$ is a homomorphism. Therefore $f^{-1}(a_0) * f^{-1}(b_0) = f^{-1}(a_0 \circ b_0)$, which shows that $f^{-1}$ satisfies the definition (6.1) of a homomorphism.


Theorem 6.6. A homomorphism $f : \mathcal{G} \to \mathcal{H}$ is an isomorphism if it is onto and if its kernel contains only the identity element of $\mathcal{G}$.

The proof of this theorem is a trivial consequence of Theorem 6.4 and the definition of an isomorphism.


Exercises

6.1 If $f : \mathcal{G} \to \mathcal{H}$ and $g : \mathcal{H} \to \mathcal{M}$ are homomorphisms, then show that the composition of the mappings $f$ and $g$ is a homomorphism from $\mathcal{G}$ to $\mathcal{M}$.

6.2 Prove Theorems 6.3 and 6.6.

6.3 Show that the logarithm function is an isomorphism from the positive real numbers under multiplication to all the real numbers under addition. What is the inverse of this isomorphism?

6.4 Show that the function $f : (\mathcal{R}, +) \to (\mathcal{R}, \cdot)$ defined by $f(x) = x^2$ is not a homomorphism.


Section 7. Rings and Fields 


Groups and semigroups are important building blocks in developing algebraic structures. In 
this section we shall consider sets that form a group with respect to one binary operation and a 
semigroup with respect to another. We shall also consider a set that is a group with respect to two 
different binary operations. 


Definition. A ring is a triple $(\mathcal{D}, +, \cdot)$ consisting of a set $\mathcal{D}$ and two binary operations $+$ and $\cdot$ such that

(a) $\mathcal{D}$ with the operation $+$ is an Abelian group.

(b) The operation $\cdot$ is associative.

(c) $\mathcal{D}$ contains an identity element, denoted by 1, with respect to the operation $\cdot$, i.e.,

$1 \cdot a = a \cdot 1 = a$

for all $a \in \mathcal{D}$.

(d) The operations $+$ and $\cdot$ satisfy the distributive axioms

$a \cdot (b + c) = a \cdot b + a \cdot c$
$(b + c) \cdot a = b \cdot a + c \cdot a$    (7.1)


The operation $+$ is called addition and the operation $\cdot$ is called multiplication. As was done with the notation for a group, the ring $(\mathcal{D}, +, \cdot)$ will be written simply as $\mathcal{D}$. Axiom (a) requires that $\mathcal{D}$ contains the identity element for the $+$ operation. We shall follow the usual procedure and denote this element by 0. Thus, $a + 0 = 0 + a = a$. Axiom (a) also requires that each element $a \in \mathcal{D}$ have an additive inverse, which we will denote by $-a$, so that $a + (-a) = 0$. The quantity $a + b$ is called the sum of $a$ and $b$, and the difference between $a$ and $b$ is $a + (-b)$, which is usually written as $a - b$. Axiom (b) requires the set $\mathcal{D}$ and the multiplication operation to form a semigroup. If the multiplication operation is also commutative, the ring is called a commutative ring. Axiom (c) requires that $\mathcal{D}$ contain an identity element for multiplication; this element is called the unity element of the ring and is denoted by 1. The symbol 1 should not be confused with the real number one. The existence of the unity element is sometimes omitted in the ring axioms and the ring as we have defined it above is called the ring with unity. Axiom (d), the distributive axiom, is the only idea in the definition of a ring that has not appeared before. It provides a rule for the interchanging of the two binary operations.

The following familiar systems are all examples of rings.

1. The set $\mathcal{Z}$ of integers with the ordinary addition and multiplication operations forms a ring.

2. The set $\mathcal{Q}$ of rational numbers with the usual addition and multiplication operations forms a ring.

3. The set $\mathcal{R}$ of real numbers with the usual addition and multiplication operations forms a ring.

4. The set $\mathcal{C}$ of complex numbers with the usual addition and multiplication operations forms a ring.

5. The set of all $N$ by $N$ matrices forms a ring with respect to the operations of matrix addition and matrix multiplication. The unity element is the unity matrix and the zero element is the zero matrix.

6. The set of all polynomials in one (or several) variable with real (or complex) coefficients forms a ring with respect to the usual operations of addition and multiplication.


Many properties of rings can be deduced from similar results that hold for groups. For example, the zero element 0, the unity element 1, and the negative element $-a$ of an element $a$ are unique. Properties that are associated with the interconnection between the two binary operations are contained in the following theorems:


Theorem 7.1. For any element $a \in \mathcal{D}$, $a \cdot 0 = 0 \cdot a = 0$.

Proof. For any $b \in \mathcal{D}$ we have by Axiom (a) that $b + 0 = b$. Thus for any $a \in \mathcal{D}$ it follows that $(b + 0) \cdot a = b \cdot a$, and Axiom (d) permits this equation to be recast in the form $b \cdot a + 0 \cdot a = b \cdot a$. From Property 8 for the additive group this equation implies that $0 \cdot a = 0$. It can be shown that $a \cdot 0 = 0$ by a similar argument.

Theorem 7.2. For all elements $a, b \in \mathcal{D}$

$(-a) \cdot b = a \cdot (-b) = -(a \cdot b)$

This theorem shows that there is no ambiguity in writing $-a \cdot b$ for $-(a \cdot b)$.


Many of the notions developed for groups can be extended to rings; for example, subrings 
and ring homomorphisms correspond to the notions of subgroups and group homomorphisms. The 
interested reader can consult the Selected Reading for a discussion of these ideas. 


The set of integers is an example of an algebraic structure called an integral domain.

Definition. A ring $\mathcal{D}$ is an integral domain if it satisfies the following additional axioms:

(e) The operation $\cdot$ is commutative.

(f) If $a, b, c$ are any elements of $\mathcal{D}$ with $c \neq 0$, then

$a \cdot c = b \cdot c \Rightarrow a = b$    (7.2)


The cancellation law of multiplication introduced by Axiom (f) is logically equivalent to the 
assertion that a product of nonzero factors is nonzero. This is proved in the following theorem. 


Theorem 7.3. A commutative ring $\mathcal{D}$ is an integral domain if and only if for all elements $a, b \in \mathcal{D}$, $a \cdot b \neq 0$ unless $a = 0$ or $b = 0$.

Proof. Assume first that $\mathcal{D}$ is an integral domain so the cancellation law holds. Suppose $a \cdot b = 0$ and $b \neq 0$. Then we can write $a \cdot b = 0 \cdot b$ and the cancellation law implies $a = 0$. Conversely, suppose that a product in a commutative ring $\mathcal{D}$ cannot be zero unless one of its factors is zero. Consider the expression $a \cdot c = b \cdot c$, which can be rewritten as $a \cdot c - b \cdot c = 0$ or as $(a - b) \cdot c = 0$. If $c \neq 0$ then by assumption $a - b = 0$ or $a = b$. This proves that the cancellation law holds.


The sets of integers $\mathcal{Z}$, rational numbers $\mathcal{Q}$, real numbers $\mathcal{R}$, and complex numbers $\mathcal{C}$ are examples of integral domains as well as being examples of rings. Sets of square matrices, while forming rings with respect to the binary operations of matrix addition and multiplication, do not, however, form integral domains. We can show this in two ways: first, by showing that matrix multiplication is not commutative and, second, by finding two nonzero matrices whose product is zero, for example,

$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$
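Both failures of the integral domain axioms for square matrices are easy to confirm by direct multiplication. The following sketch uses plain lists of rows; the helper name `matmul` and the second pair of matrices are our own illustrative choices.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two nonzero matrices whose product is the zero matrix (Axiom (f) fails).
P = [[1, 0], [0, 0]]
Q = [[0, 0], [0, 1]]
print(matmul(P, Q))                    # [[0, 0], [0, 0]]

# Two matrices that do not commute (Axiom (e) fails).
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
print(matmul(A, B), matmul(B, A))      # [[1, 0], [0, 0]] [[0, 0], [0, 1]]
```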


The rational numbers, the real numbers, and the complex numbers are examples of an 
algebraic structure called a field. 


Definition. A field $\mathcal{F}$ is an integral domain, containing more than one element, and such that any element $a \in \mathcal{F}/\{0\}$ has an inverse with respect to multiplication.

It is clear from this definition that the set $\mathcal{F}$ and the addition operation as well as the set $\mathcal{F}/\{0\}$ and the multiplication operation form Abelian groups. Hence the unity element 1, the zero element 0, the negative $(-a)$, as well as the reciprocal $(1/a)$, $a \neq 0$, are unique. The formulas of arithmetic can be developed as theorems following from the axioms of a field. It is not our purpose to do this here; however, it is important to convince oneself that it can be done.


The definitions of ring, integral domain, and field are successively more restrictive. It is trivial to notice that any set that is a field is automatically an integral domain and a ring. Similarly, any set that is an integral domain is automatically a ring.
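The extra requirement that separates a field from a commutative ring can be seen in a family of small examples. As a sketch, assume the integers modulo $n$ with addition and multiplication modulo $n$, which form a commutative ring with unity but are not discussed in the text; the function names are hypothetical.

```python
def inverses_mod(n):
    """Map each nonzero residue a to a residue b with a*b = 1 (mod n),
    or to None when no such b exists."""
    return {a: next((b for b in range(1, n) if (a * b) % n == 1), None)
            for a in range(1, n)}

def is_field_mod(n):
    """True when every nonzero residue mod n has a multiplicative inverse."""
    return n > 1 and all(b is not None for b in inverses_mod(n).values())

print(inverses_mod(5))    # {1: 1, 2: 3, 3: 2, 4: 4}
print(is_field_mod(5))    # True
print(is_field_mod(6))    # False: 2 * 3 = 0 (mod 6), so 2 has no inverse
```

Modulo 6 the product of the nonzero elements 2 and 3 is zero, so this ring is not even an integral domain, in agreement with Theorem 7.3.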


The dependence of the algebraic structures introduced in this chapter is illustrated 
schematically in Figure 7.1. The dependence of the vector space, which is to be introduced in the 
next chapter, upon these algebraic structures is also indicated. 


Figure 1. A schematic of algebraic structures (Semigroup, Group, Abelian group, Ring, Commutative ring, Integral domain, Field, Vector space).


Exercises

7.1 Verify that Examples 1 and 5 are rings.

7.2 Prove Theorem 7.2.

7.3 Prove that if $a, b, c, d$ are elements of a ring $\mathcal{D}$, then
(a) $(-a) \cdot (-b) = a \cdot b$.
(b) $(-1) \cdot a = a \cdot (-1) = -a$.
(c) $(a + b) \cdot (c + d) = a \cdot c + a \cdot d + b \cdot c + b \cdot d$.

7.4 Among the Examples 1-6 of rings given in the text, which ones are actually integral domains, and which ones are fields?

7.5 Is the set of rational numbers a field?

7.6 Why does the set of integers not constitute a field?

7.7 If $\mathcal{F}$ is a field, why is $\mathcal{F}/\{0\}$ an Abelian group with respect to multiplication?

7.8 Show that for all rational numbers $x, y$, the set of elements of the form $x + y\sqrt{2}$ constitutes a field.

7.9 For all rational numbers $x, y, z$, does the set of elements of the form $x + y\sqrt{2} + z\sqrt{3}$ form a field? If not, show how to enlarge it to a field.


Chapter 3


VECTOR SPACES 


Section 8. The Axioms for a Vector Space 


Generally, one’s first encounter with the concept of a vector is a geometrical one in the 
form of a directed line segment, that is to say, a straight line with an arrowhead. This type of 
vector, if it is properly defined, is a special example of the more general notion of a vector 
presented in this chapter. The concept of a vector put forward here is purely algebraic. The 
definition given for a vector is that it be a member of a set that satisfies certain algebraic rules. 

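A vector space is a triple $(\mathcal{V}, \mathcal{F}, f)$ consisting of (a) an additive Abelian group $\mathcal{V}$, (b) a field $\mathcal{F}$, and (c) a function $f : \mathcal{F} \times \mathcal{V} \to \mathcal{V}$ called scalar multiplication such that

$f(\lambda, f(\mu, \mathbf{v})) = f(\lambda\mu, \mathbf{v})$
$f(\lambda + \mu, \mathbf{u}) = f(\lambda, \mathbf{u}) + f(\mu, \mathbf{u})$
$f(\lambda, \mathbf{u} + \mathbf{v}) = f(\lambda, \mathbf{u}) + f(\lambda, \mathbf{v})$
$f(1, \mathbf{v}) = \mathbf{v}$    (8.1)

for all $\lambda, \mu \in \mathcal{F}$ and all $\mathbf{u}, \mathbf{v} \in \mathcal{V}$. A vector is an element of a vector space. The notation $(\mathcal{V}, \mathcal{F}, f)$ for a vector space will be shortened to simply $\mathcal{V}$. The first of (8.1) is usually called the associative law for scalar multiplication, while the second and third equations are distributive laws, the second for scalar addition and the third for vector addition.

It is also customary to use a simplified notation for the scalar multiplication function $f$. We shall write $f(\lambda, \mathbf{v}) = \lambda\mathbf{v}$ and also regard $\lambda\mathbf{v}$ and $\mathbf{v}\lambda$ to be identical. In this simplified notation we shall now list in detail the axioms of a vector space. In this definition the vector $\mathbf{u} + \mathbf{v}$ in $\mathcal{V}$ is called the sum of $\mathbf{u}$ and $\mathbf{v}$ and the difference of $\mathbf{u}$ and $\mathbf{v}$ is written $\mathbf{u} - \mathbf{v}$ and is defined by

$\mathbf{u} - \mathbf{v} = \mathbf{u} + (-\mathbf{v})$    (8.2)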


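Definition. Let $\mathcal{V}$ be a set and $\mathcal{F}$ a field. $\mathcal{V}$ is a vector space if it satisfies the following rules:

(a) There exists a binary operation in $\mathcal{V}$ called addition and denoted by $+$ such that

(1) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathcal{V}$.

(2) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ for all $\mathbf{u}, \mathbf{v} \in \mathcal{V}$.

(3) There exists an element $\mathbf{0} \in \mathcal{V}$ such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$ for all $\mathbf{u} \in \mathcal{V}$.

(4) For every $\mathbf{u} \in \mathcal{V}$ there exists an element $-\mathbf{u} \in \mathcal{V}$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.

(b) There exists an operation called scalar multiplication in which every scalar $\lambda \in \mathcal{F}$ can be combined with every element $\mathbf{u} \in \mathcal{V}$ to give an element $\lambda\mathbf{u} \in \mathcal{V}$ such that

(1) $\lambda(\mu\mathbf{u}) = (\lambda\mu)\mathbf{u}$

(2) $(\lambda + \mu)\mathbf{u} = \lambda\mathbf{u} + \mu\mathbf{u}$

(3) $\lambda(\mathbf{u} + \mathbf{v}) = \lambda\mathbf{u} + \lambda\mathbf{v}$

(4) $1\mathbf{u} = \mathbf{u}$

for all $\lambda, \mu \in \mathcal{F}$ and all $\mathbf{u}, \mathbf{v} \in \mathcal{V}$.

If the field $\mathcal{F}$ employed in a vector space is actually the field of real numbers $\mathcal{R}$, the space is called a real vector space. A complex vector space is similarly defined.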


Except for a few minor examples, the material in this chapter and the next three employs 
complex vector spaces. Naturally the real vector space is a trivial special case. The reason for 
allowing the scalar field to be complex in these chapters is that the material on spectral 
decompositions in Chapter 6 has more usefulness in the complex case. After Chapter 6 we shall 
specialize the discussion to real vector spaces. Therefore, unless we provide some qualifying 
statement to the contrary, a vector space should be understood to be a complex vector space. 


There are many and varied sets of objects that qualify as vector spaces. The following is a 
list of examples of vector spaces: 


1. The vector space $\mathcal{C}^N$ is the set of all $N$-tuples of the form $\mathbf{u} = (\lambda_1, \lambda_2, \ldots, \lambda_N)$, where $N$ is a positive integer and $\lambda_1, \lambda_2, \ldots, \lambda_N \in \mathcal{C}$. Since an $N$-tuple is an ordered set, if $\mathbf{v} = (\mu_1, \mu_2, \ldots, \mu_N)$ is a second $N$-tuple, then $\mathbf{u}$ and $\mathbf{v}$ are equal if and only if

$\mu_k = \lambda_k$ for all $k = 1, 2, \ldots, N$

The zero $N$-tuple is $\mathbf{0} = (0, 0, \ldots, 0)$ and the negative of the $N$-tuple $\mathbf{u}$ is $-\mathbf{u} = (-\lambda_1, -\lambda_2, \ldots, -\lambda_N)$. Addition and scalar multiplication of $N$-tuples are defined by the formulas

$\mathbf{u} + \mathbf{v} = (\mu_1 + \lambda_1, \mu_2 + \lambda_2, \ldots, \mu_N + \lambda_N)$

and

$\mu\mathbf{u} = (\mu\lambda_1, \mu\lambda_2, \ldots, \mu\lambda_N)$

respectively. The notation $\mathcal{C}^N$ is used for this vector space because it can be considered to be an $N$th Cartesian product of $\mathcal{C}$.

2. The set $\mathcal{V}$ of all $N \times M$ complex matrices is a vector space with respect to the usual operation of matrix addition and multiplication by a scalar.

3. Let $\mathcal{H}$ be a vector space whose vectors are actually functions defined on a set $\mathcal{A}$ with values in $\mathcal{C}$. Thus, if $h \in \mathcal{H}$, $x \in \mathcal{A}$ then $h(x) \in \mathcal{C}$ and $h : \mathcal{A} \to \mathcal{C}$. If $k$ is another vector of $\mathcal{H}$ then equality of vectors is defined by

$h = k \Leftrightarrow h(x) = k(x)$ for all $x \in \mathcal{A}$

The zero vector is the zero function whose value is zero for all $x$. Addition and scalar multiplication are defined by

$(h + k)(x) = h(x) + k(x)$

and

$(\lambda h)(x) = \lambda(h(x))$

respectively.

4. Let $\mathcal{P}$ denote the set of all polynomials $u$ of degree $N$ of the form

$u = \lambda_0 + \lambda_1 x + \lambda_2 x^2 + \cdots + \lambda_N x^N$

where $\lambda_0, \lambda_1, \lambda_2, \ldots, \lambda_N \in \mathcal{C}$. The set $\mathcal{P}$ forms a vector space over the complex numbers $\mathcal{C}$ if addition and scalar multiplication of polynomials are defined in the usual way.

5. The set of complex numbers $\mathcal{C}$, with the usual definitions of addition and multiplication by a real number, forms a real vector space.

6. The zero element $\mathbf{0}$ of any vector space forms a vector space by itself.
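The componentwise definitions of Example 1 translate directly into code. The sketch below models vectors of $\mathcal{C}^3$ as tuples of Python complex numbers and checks several of the axioms on sample values; the names `add` and `scale` and the sample data are our own, and the entries are chosen with integer parts so that the floating-point comparisons are exact.

```python
def add(u, v):
    """Componentwise sum of two N-tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(lam, u):
    """Scalar multiple of an N-tuple."""
    return tuple(lam * a for a in u)

u = (1 + 2j, 0j, 3 - 1j)
v = (2j, 1 + 0j, -1 + 0j)
lam, mu = 2 - 1j, 3j
zero = (0j, 0j, 0j)

checks = {
    "(a2) u + v = v + u": add(u, v) == add(v, u),
    "(a3) u + 0 = u": add(u, zero) == u,
    "(a4) u + (-u) = 0": add(u, scale(-1, u)) == zero,
    "(b1) lam(mu u) = (lam mu)u": scale(lam, scale(mu, u)) == scale(lam * mu, u),
    "(b2) (lam + mu)u = lam u + mu u":
        scale(lam + mu, u) == add(scale(lam, u), scale(mu, u)),
    "(b3) lam(u + v) = lam u + lam v":
        scale(lam, add(u, v)) == add(scale(lam, u), scale(lam, v)),
    "(b4) 1u = u": scale(1, u) == u,
}
for name, ok in checks.items():
    print(name, ok)
```

Checking sample values does not prove the axioms, but it is a quick way to catch a wrongly coded operation.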


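The operation of scalar multiplication is not a binary operation as the other operations we have considered have been. In order to develop some familiarity with the algebra of this operation consider the following three theorems.

Theorem 8.1. $\lambda\mathbf{u} = \mathbf{0} \Leftrightarrow \lambda = 0$ or $\mathbf{u} = \mathbf{0}$.

Proof. The proof of this theorem actually requires the proof of the following three assertions:

(a) $0\mathbf{u} = \mathbf{0}$, (b) $\lambda\mathbf{0} = \mathbf{0}$, (c) $\lambda\mathbf{u} = \mathbf{0} \Rightarrow \lambda = 0$ or $\mathbf{u} = \mathbf{0}$

To prove (a), take $\mu = 0$ in Axiom (b2) for a vector space; then $\lambda\mathbf{u} = \lambda\mathbf{u} + 0\mathbf{u}$. Therefore

$\lambda\mathbf{u} - \lambda\mathbf{u} = \lambda\mathbf{u} - \lambda\mathbf{u} + 0\mathbf{u}$

and by Axioms (a4) and (a3)

$\mathbf{0} = \mathbf{0} + 0\mathbf{u} = 0\mathbf{u}$

which proves (a).

To prove (b), set $\mathbf{v} = \mathbf{0}$ in Axiom (b3) for a vector space; then $\lambda\mathbf{u} = \lambda\mathbf{u} + \lambda\mathbf{0}$. Therefore

$\lambda\mathbf{u} - \lambda\mathbf{u} = \lambda\mathbf{u} - \lambda\mathbf{u} + \lambda\mathbf{0}$

and by Axiom (a4)

$\mathbf{0} = \mathbf{0} + \lambda\mathbf{0} = \lambda\mathbf{0}$

To prove (c), we assume $\lambda\mathbf{u} = \mathbf{0}$. If $\lambda = 0$, we know from (a) that the equation $\lambda\mathbf{u} = \mathbf{0}$ is satisfied. If $\lambda \neq 0$, then we show that $\mathbf{u}$ must be zero as follows:

$\mathbf{u} = 1\mathbf{u} = \left(\frac{1}{\lambda}\lambda\right)\mathbf{u} = \frac{1}{\lambda}(\lambda\mathbf{u}) = \frac{1}{\lambda}\mathbf{0} = \mathbf{0}$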

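Theorem 8.2. $(-\lambda)\mathbf{u} = -\lambda\mathbf{u}$ for all $\lambda \in \mathcal{C}$, $\mathbf{u} \in \mathcal{V}$.

Proof. Let $\mu = -\lambda$ in Axiom (b2) for a vector space; then $0\mathbf{u} = \lambda\mathbf{u} + (-\lambda)\mathbf{u}$. Since $0\mathbf{u} = \mathbf{0}$ by Theorem 8.1, $(-\lambda)\mathbf{u}$ is the negative of $\lambda\mathbf{u}$ and this result follows directly.

Theorem 8.3. $-\lambda\mathbf{u} = \lambda(-\mathbf{u})$ for all $\lambda \in \mathcal{C}$, $\mathbf{u} \in \mathcal{V}$.

Finally, we note that the concepts of length and angle have not been introduced. They will be introduced in a later section. The reason for the delay in their introduction is to emphasize that certain results can be established without reference to these concepts.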


Exercises 


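8.1 Show that at least one of the axioms for a vector space is redundant. In particular, one might show that Axiom (a2) can be deduced from the remaining axioms. [Hint: expand $(1 + 1)(\mathbf{u} + \mathbf{v})$ by the two different rules.]

8.2 Verify that the sets listed in Examples 1-6 are actually vector spaces. In particular, list the zero vectors and the negative of a typical vector in each example.

8.3 Show that the axioms for a vector space still make sense if the field $\mathcal{F}$ is replaced by a ring. The resulting structure is called a module over a ring.

8.4 Show that the Axiom (b4) for a vector space can be replaced by the axiom

(b4)' $\lambda\mathbf{u} = \mathbf{0} \Leftrightarrow \lambda = 0$ or $\mathbf{u} = \mathbf{0}$

8.5 Let $\mathcal{V}$ and $\mathcal{U}$ be vector spaces. Show that the set $\mathcal{V} \times \mathcal{U}$ is a vector space with the definitions

$(\mathbf{u}, \mathbf{x}) + (\mathbf{v}, \mathbf{y}) = (\mathbf{u} + \mathbf{v}, \mathbf{x} + \mathbf{y})$

and

$\lambda(\mathbf{u}, \mathbf{x}) = (\lambda\mathbf{u}, \lambda\mathbf{x})$

where $\mathbf{u}, \mathbf{v} \in \mathcal{V}$; $\mathbf{x}, \mathbf{y} \in \mathcal{U}$; and $\lambda \in \mathcal{F}$.

8.6 Prove Theorem 8.3.

8.7 Let $\mathcal{V}$ be a vector space and consider the set $\mathcal{V} \times \mathcal{V}$. Define addition in $\mathcal{V} \times \mathcal{V}$ by

$(\mathbf{u}, \mathbf{v}) + (\mathbf{x}, \mathbf{y}) = (\mathbf{u} + \mathbf{x}, \mathbf{v} + \mathbf{y})$

and multiplication by complex numbers by

$(\lambda + i\mu)(\mathbf{u}, \mathbf{v}) = (\lambda\mathbf{u} - \mu\mathbf{v}, \mu\mathbf{u} + \lambda\mathbf{v})$

where $\lambda, \mu \in \mathcal{R}$. Show that $\mathcal{V} \times \mathcal{V}$ is a vector space over the field of complex numbers.


Section 9. Linear Independence, Dimension, and Basis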


The concept of linear independence is introduced by first defining what is meant by linear 
dependence in a set of vectors and then defining a set of vectors that is not linearly dependent to be 
linearly independent. The general definition of linear dependence of a set of N vectors is an 
algebraic generalization and abstraction of the concepts of collinearity from elementary geometry. 


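Definition. A finite set of $N$ ($N \geq 1$) vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ in a vector space $\mathcal{V}$ is said to be linearly dependent if there exists a set of scalars $\{\lambda^1, \lambda^2, \ldots, \lambda^N\}$, not all zero, such that

$\sum_{j=1}^{N} \lambda^j \mathbf{v}_j = \mathbf{0}$    (9.1)

The essential content of this definition is that at least one of the vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ can be expressed as a linear combination of the other vectors. This means that if $\lambda^1 \neq 0$, then $\mathbf{v}_1 = \sum_{j=2}^{N} \mu^j \mathbf{v}_j$, where $\mu^j = -\lambda^j / \lambda^1$ for $j = 2, 3, \ldots, N$. As a numerical example, consider the two vectors

$\mathbf{v}_1 = (1, 2, 3), \quad \mathbf{v}_2 = (3, 6, 9)$

from $\mathcal{R}^3$. These vectors are linearly dependent since

$\mathbf{v}_2 = 3\mathbf{v}_1$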

The proof of the following two theorems on linear dependence is quite simple. 


Theorem 9.1. If the set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ is linearly dependent, then every other finite set of vectors containing $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ is linearly dependent.

Theorem 9.2. Every set of vectors containing the zero vector is linearly dependent.

A set of $N$ ($N \geq 1$) vectors that is not linearly dependent is said to be linearly independent. Equivalently, a set of $N$ ($N \geq 1$) vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ is linearly independent if (9.1) implies $\lambda^1 = \lambda^2 = \cdots = \lambda^N = 0$. As a numerical example, consider the two vectors $\mathbf{v}_1 = (1, 2)$ and $\mathbf{v}_2 = (2, 1)$ from $\mathcal{R}^2$. These two vectors are linearly independent because

$\lambda\mathbf{v}_1 + \mu\mathbf{v}_2 = \lambda(1, 2) + \mu(2, 1) = \mathbf{0} \Leftrightarrow \lambda + 2\mu = 0, \ 2\lambda + \mu = 0$

and

$\lambda + 2\mu = 0, \ 2\lambda + \mu = 0 \Leftrightarrow \lambda = 0, \ \mu = 0$
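For two vectors in $\mathcal{R}^2$ the last equivalence can be read off from a determinant: the homogeneous system for $\lambda$ and $\mu$ has only the trivial solution exactly when the determinant formed from the two vectors is nonzero. This determinant criterion is a standard fact that the text has not invoked at this point, so the sketch below should be read as a supplementary check; the function names are ours.

```python
def det2(v1, v2):
    """Determinant of the 2 x 2 array formed from the vectors v1 and v2."""
    return v1[0] * v2[1] - v1[1] * v2[0]

def only_trivial_solution(v1, v2):
    """True when lam*v1 + mu*v2 = 0 forces lam = mu = 0."""
    return det2(v1, v2) != 0

print(det2((1, 2), (2, 1)))                    # -3
print(only_trivial_solution((1, 2), (2, 1)))   # True: linearly independent
print(only_trivial_solution((1, 2), (2, 4)))   # False: (2, 4) = 2 (1, 2)
```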


Theorem 9.3. Every nonempty subset of a linearly independent set is linearly independent.

A linearly independent set in a vector space is said to be maximal if it is not a proper subset of any other linearly independent set. A vector space that contains a (finite) maximal, linearly independent set is then said to be finite dimensional. Of course, if $\mathcal{V}$ is not finite dimensional, then it is called infinite dimensional. In this text we shall be concerned only with finite-dimensional vector spaces.


Theorem 9.4. Any two maximal, linearly independent sets of a finite-dimensional vector space 
must contain exactly the same number of vectors. 


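Proof. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$ and $\{\mathbf{u}_1, \ldots, \mathbf{u}_M\}$ be two maximal, linearly independent sets of $\mathcal{V}$. Then we must show that $N = M$. Suppose that $N \neq M$, say $N < M$. By the fact that $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$ is maximal, the sets $\{\mathbf{v}_1, \ldots, \mathbf{v}_N, \mathbf{u}_1\}, \ldots, \{\mathbf{v}_1, \ldots, \mathbf{v}_N, \mathbf{u}_M\}$ are all linearly dependent. Hence there exist relations

$\lambda_{11}\mathbf{v}_1 + \cdots + \lambda_{1N}\mathbf{v}_N + \mu_1\mathbf{u}_1 = \mathbf{0}$
$\vdots$
$\lambda_{M1}\mathbf{v}_1 + \cdots + \lambda_{MN}\mathbf{v}_N + \mu_M\mathbf{u}_M = \mathbf{0}$    (9.2)

where the coefficients of each equation are not all equal to zero. In fact, the coefficients $\mu_1, \ldots, \mu_M$ of the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_M$ are all nonzero, for if $\mu_i = 0$ for any $i$, then

$\lambda_{i1}\mathbf{v}_1 + \cdots + \lambda_{iN}\mathbf{v}_N = \mathbf{0}$

for some $\lambda_{i1}, \ldots, \lambda_{iN}$ that are not all zero, contradicting the assumption that $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$ is a linearly independent set. Hence we can solve (9.2) for the vectors $\{\mathbf{u}_1, \ldots, \mathbf{u}_M\}$ in terms of the vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$, obtaining

$\mathbf{u}_1 = \mu_{11}\mathbf{v}_1 + \cdots + \mu_{1N}\mathbf{v}_N$
$\vdots$
$\mathbf{u}_M = \mu_{M1}\mathbf{v}_1 + \cdots + \mu_{MN}\mathbf{v}_N$    (9.3)

where $\mu_{ij} = -\lambda_{ij} / \mu_i$ for $i = 1, \ldots, M$; $j = 1, \ldots, N$.

Now we claim that the first $N$ equations in the above system can be inverted in such a way that the vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$ are given by linear combinations of the vectors $\{\mathbf{u}_1, \ldots, \mathbf{u}_N\}$. Indeed, inversion is possible if the coefficient matrix $[\mu_{ij}]$ for $i, j = 1, \ldots, N$ is nonsingular. But this is clearly the case, for if that matrix were singular, there would be nontrivial solutions $\{\alpha_1, \ldots, \alpha_N\}$ to the linear system

$\sum_{i=1}^{N} \alpha_i \mu_{ij} = 0, \quad j = 1, \ldots, N$    (9.4)

Then from (9.3) and (9.4) we would have

$\sum_{i=1}^{N} \alpha_i \mathbf{u}_i = \sum_{j=1}^{N} \left( \sum_{i=1}^{N} \alpha_i \mu_{ij} \right) \mathbf{v}_j = \mathbf{0}$

contradicting the assumption that the set $\{\mathbf{u}_1, \ldots, \mathbf{u}_N\}$, being a subset of the linearly independent set $\{\mathbf{u}_1, \ldots, \mathbf{u}_M\}$, is linearly independent.

Now if the inversion of the first $N$ equations of the system (9.3) gives

$\mathbf{v}_1 = \xi_{11}\mathbf{u}_1 + \cdots + \xi_{1N}\mathbf{u}_N$
$\vdots$
$\mathbf{v}_N = \xi_{N1}\mathbf{u}_1 + \cdots + \xi_{NN}\mathbf{u}_N$    (9.5)

where $[\xi_{ij}]$ is the inverse of $[\mu_{ij}]$ for $i, j = 1, \ldots, N$, we can substitute (9.5) into the remaining $M - N$ equations in the system (9.3), obtaining

$\mathbf{u}_{N+1} = \sum_{j=1}^{N} \left( \sum_{i=1}^{N} \mu_{N+1,i}\, \xi_{ij} \right) \mathbf{u}_j$
$\vdots$
$\mathbf{u}_M = \sum_{j=1}^{N} \left( \sum_{i=1}^{N} \mu_{Mi}\, \xi_{ij} \right) \mathbf{u}_j$

But these equations contradict the assumption that the set $\{\mathbf{u}_1, \ldots, \mathbf{u}_M\}$ is linearly independent. Hence $M > N$ is impossible and the proof is complete.

An important corollary of the preceding theorem is the following.

Theorem 9.5. Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_N\}$ be a maximal linearly independent set in $\mathcal{V}$, and suppose that $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$ is given by (9.5). Then $\{\mathbf{v}_1, \ldots, \mathbf{v}_N\}$ is also a maximal, linearly independent set if and only if the coefficient matrix $[\xi_{ij}]$ in (9.5) is nonsingular. In particular, if $\mathbf{v}_i = \mathbf{u}_i$ for $i = 1, \ldots, k-1, k+1, \ldots, N$ but $\mathbf{v}_k \neq \mathbf{u}_k$, then $\{\mathbf{u}_1, \ldots, \mathbf{u}_{k-1}, \mathbf{v}_k, \mathbf{u}_{k+1}, \ldots, \mathbf{u}_N\}$ is a maximal, linearly independent set if and only if the coefficient $\xi_{kk}$ in the expansion of $\mathbf{v}_k$ in terms of $\{\mathbf{u}_1, \ldots, \mathbf{u}_N\}$ in (9.5) is nonzero.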
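The special case in Theorem 9.5 can be checked by building the coefficient matrix of (9.5) when a single vector is exchanged. In the sketch below $N = 3$, indices start at 0 as is usual in Python, and the names `det3` and `exchange_matrix` are illustrative; the determinant of the resulting matrix turns out to be the single coefficient $\xi_{kk}$.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def exchange_matrix(k, coeffs):
    """Coefficient matrix of (9.5) when v_i = u_i for i != k and
    v_k = sum_j coeffs[j] * u_j."""
    n = len(coeffs)
    rows = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    rows[k] = list(coeffs)
    return rows

good = exchange_matrix(1, [2, 5, -1])   # xi_kk = 5
bad = exchange_matrix(1, [2, 0, -1])    # xi_kk = 0
print(det3(good), det3(bad))            # 5 0
```

The first exchange therefore yields another maximal, linearly independent set, while the second does not.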


From the preceding theorems we see that the number $N$ of vectors in a maximal, linearly independent set in a finite dimensional vector space $\mathcal{V}$ is an intrinsic property of $\mathcal{V}$. We shall call this number $N$ the dimension of $\mathcal{V}$, written $\dim \mathcal{V}$, namely $N = \dim \mathcal{V}$, and we shall call any maximal, linearly independent set of $\mathcal{V}$ a basis of that space. Theorem 9.5 characterizes all bases of $\mathcal{V}$ as soon as one basis is specified. A list of examples of bases for vector spaces follows:

1. The set of $N$ vectors

$(1, 0, 0, \ldots, 0)$
$(0, 1, 0, \ldots, 0)$
$(0, 0, 1, \ldots, 0)$
$\vdots$
$(0, 0, 0, \ldots, 1)$

is linearly independent and constitutes a basis for $\mathcal{C}^N$, called the standard basis.

2. If $\mathcal{M}^{2 \times 2}$ denotes the vector space of all $2 \times 2$ matrices with elements from the complex numbers $\mathcal{C}$, then the four matrices

$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$

form a basis for $\mathcal{M}^{2 \times 2}$ called the standard basis.

3. The elements 1 and $i = \sqrt{-1}$ form a basis for the vector space $\mathcal{C}$ of complex numbers over the field of real numbers.


The following two theorems concerning bases are fundamental. 


Theorem 9.6. If $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$ is a basis for $\mathscr{V}$, then every vector in $\mathscr{V}$ has the representation

$$
\mathbf{v} = \sum_{j=1}^{N} \xi^j \mathbf{e}_j \tag{9.6}
$$

where $\{\xi^1, \dots, \xi^N\}$ are elements of $\mathscr{C}$ which depend upon the vector $\mathbf{v} \in \mathscr{V}$.


The proof of this theorem is contained in the proof of Theorem 9.4.


Theorem 9.7. The $N$ scalars $\{\xi^1, \dots, \xi^N\}$ in (9.6) are unique.


Proof. As is customary in the proof of a uniqueness theorem, we assume a lack of uniqueness. Thus we say that $\mathbf{v}$ has two representations,

$$
\mathbf{v} = \sum_{j=1}^{N} \xi^j \mathbf{e}_j, \qquad \mathbf{v} = \sum_{j=1}^{N} \mu^j \mathbf{e}_j
$$

Subtracting the two representations, we have

$$
\sum_{j=1}^{N} (\xi^j - \mu^j)\,\mathbf{e}_j = \mathbf{0}
$$

and the linear independence of the basis requires that

$$
\xi^j = \mu^j, \qquad j = 1, 2, \dots, N
$$

The coefficients $\{\xi^1, \dots, \xi^N\}$ in the representation (9.6) are the components of $\mathbf{v}$ with respect to the basis $\{\mathbf{e}_1, \dots, \mathbf{e}_N\}$. The representation of vectors in terms of the elements of their basis is illustrated in the following examples.


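1. The vector space $\mathscr{C}^3$ has a standard basis consisting of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$;

$$
\mathbf{e}_1 = (1, 0, 0), \quad \mathbf{e}_2 = (0, 1, 0), \quad \mathbf{e}_3 = (0, 0, 1)
$$

A vector $\mathbf{v} = (2+i, 7, 8+3i)$ can be written as

$$
\mathbf{v} = (2+i)\,\mathbf{e}_1 + 7\,\mathbf{e}_2 + (8+3i)\,\mathbf{e}_3
$$

2. The vector space of complex numbers $\mathscr{C}$ over the space of real numbers $\mathscr{R}$ has the basis $\{1, i\}$. Any complex number $z$ can then be represented by

$$
z = \mu + \lambda i
$$

where $\mu, \lambda \in \mathscr{R}$.

3. The vector space $\mathscr{M}^{2\times 2}$ of all $2 \times 2$ matrices with elements from the complex numbers $\mathscr{C}$ has the standard basis

$$
\mathbf{e}_{11} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
\mathbf{e}_{12} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
\mathbf{e}_{21} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad
\mathbf{e}_{22} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
$$

Any $2 \times 2$ matrix of the form

$$
\mathbf{v} = \begin{bmatrix} \mu & \lambda \\ \nu & \xi \end{bmatrix}
$$

where $\mu, \lambda, \nu, \xi \in \mathscr{C}$, can then be represented by

$$
\mathbf{v} = \mu\,\mathbf{e}_{11} + \lambda\,\mathbf{e}_{12} + \nu\,\mathbf{e}_{21} + \xi\,\mathbf{e}_{22}
$$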
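The uniqueness of components asserted by Theorem 9.7 can be made concrete by computing them. The following is a minimal sketch, not part of the original text: it assumes rational (hence real) scalars so that the arithmetic is exact, and the function name `components` is our own. It solves $\mathbf{v} = \sum_j \xi^j \mathbf{e}_j$ for the $\xi^j$ by Gauss-Jordan elimination and reports an error when the given vectors are not a basis.

```python
from fractions import Fraction


def components(basis, v):
    """Return [xi^1, ..., xi^N] with v = sum_j xi^j * basis[j].

    Assumes `basis` is a list of N vectors of length N with rational
    entries. Raises ValueError if the vectors are linearly dependent.
    """
    n = len(v)
    # Augmented matrix: column j holds the basis vector e_j, last column holds v.
    rows = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
            for i in range(n)]
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("vectors are linearly dependent; not a basis")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [x / p for x in rows[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


# A non-standard basis of R^3 and the components of (2, 7, 8) relative to it.
basis = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
xi = components(basis, [2, 7, 8])
```

Here `xi` holds the three components; because the elimination either succeeds with a single answer or fails outright, the sketch mirrors the statement that a representation with respect to a basis exists and is unique.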
The basis for a vector space is not unique and the general rule of change of basis is given by
Theorem 9.5. An important special case of that rule is made explicit in the following exchange 
theorem. 


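Theorem 9.8. If $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$ is a basis for $\mathscr{V}$ and if $\mathscr{B} = \{\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_K\}$ is a linearly independent set of $K$ vectors $(N \geq K)$ in $\mathscr{V}$, then it is possible to exchange a certain $K$ of the original base vectors with $\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_K$ so that the new set is a basis for $\mathscr{V}$.


Proof. We select $\mathbf{b}_1$ from the set $\mathscr{B}$ and order the basis vectors such that the component $\xi^1$ of $\mathbf{b}_1$ is not zero in the representation

$$
\mathbf{b}_1 = \sum_{j=1}^{N} \xi^j \mathbf{e}_j
$$

(Such an ordering exists because $\mathbf{b}_1 \neq \mathbf{0}$.) By Theorem 9.5, the vectors $\{\mathbf{b}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$ form a basis for $\mathscr{V}$. A second vector $\mathbf{b}_2$ is selected from $\mathscr{B}$ and we again order the basis vectors so that this time the component $\lambda^2$ is not zero in the formula

$$
\mathbf{b}_2 = \lambda^1 \mathbf{b}_1 + \sum_{j=2}^{N} \lambda^j \mathbf{e}_j
$$

(At least one $\lambda^j$ with $j \geq 2$ is nonzero, since otherwise $\mathbf{b}_2 = \lambda^1 \mathbf{b}_1$, contradicting the linear independence of $\mathscr{B}$.) Again, by Theorem 9.5 the vectors $\{\mathbf{b}_1, \mathbf{b}_2, \mathbf{e}_3, \dots, \mathbf{e}_N\}$ form a basis for $\mathscr{V}$. The proof is completed by simply repeating the above construction $K - 2$ times.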


We now know that when a basis for $\mathscr{V}$ is given, every vector in $\mathscr{V}$ has a representation in the form (9.6). Inverting this condition somewhat, we now want to consider a set of vectors $\mathscr{B} = \{\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_M\}$ of $\mathscr{V}$ with the property that every vector $\mathbf{v} \in \mathscr{V}$ can be written as

$$
\mathbf{v} = \sum_{j=1}^{M} \lambda^j \mathbf{b}_j
$$

Such a set is called a generating set of $\mathscr{V}$ (or is said to span $\mathscr{V}$). In some sense a generating set is a counterpart of a linearly independent set. The following theorem is the counterpart of Theorem 9.1.

Theorem 9.9. If $\mathscr{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_M\}$ is a generating set of $\mathscr{V}$, then every other finite set of vectors containing $\mathscr{B}$ is also a generating set of $\mathscr{V}$.


In view of this theorem, we see that the counterpart of a maximal, linearly independent set is a minimal generating set, which is defined by the condition that a generating set $\{\mathbf{b}_1, \dots, \mathbf{b}_M\}$ is minimal if it contains no proper subset that is still a generating set. The following theorem shows the relation between a maximal, linearly independent set and a minimal generating set.


Theorem 9.10. Let $\mathscr{B} = \{\mathbf{b}_1, \dots, \mathbf{b}_M\}$ be a finite subset of a finite dimensional vector space $\mathscr{V}$. Then the following conditions are equivalent:


(i) $\mathscr{B}$ is a maximal linearly independent set.
(ii) $\mathscr{B}$ is a linearly independent generating set.
(iii) $\mathscr{B}$ is a minimal generating set.


Proof. We shall show that (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i).


(i) $\Rightarrow$ (ii). This implication is a direct consequence of the representation (9.6).


(ii) $\Rightarrow$ (iii). This implication is obvious. For if $\mathscr{B}$ is a linearly independent generating set but not a minimal generating set, then we can remove at least one vector, say $\mathbf{b}_M$, and the remaining set is still a generating set. But this is impossible because $\mathbf{b}_M$ can then be expressed as a linear combination of $\{\mathbf{b}_1, \dots, \mathbf{b}_{M-1}\}$, contradicting the linear independence of $\mathscr{B}$.


(iii) $\Rightarrow$ (i). If $\mathscr{B}$ is a minimal generating set, then $\mathscr{B}$ must be linearly independent because otherwise one of the vectors of $\mathscr{B}$, say $\mathbf{b}_M$, can be written as a linear combination of the vectors $\{\mathbf{b}_1, \dots, \mathbf{b}_{M-1}\}$. It follows then that $\{\mathbf{b}_1, \dots, \mathbf{b}_{M-1}\}$ is still a generating set, contradicting the assumption that $\{\mathbf{b}_1, \dots, \mathbf{b}_M\}$ is minimal. Now a linearly independent generating set must be maximal, for otherwise there exists a vector $\mathbf{b} \in \mathscr{V}$ such that $\{\mathbf{b}_1, \dots, \mathbf{b}_M, \mathbf{b}\}$ is linearly independent. Then $\mathbf{b}$ cannot be expressed as a linear combination of $\{\mathbf{b}_1, \dots, \mathbf{b}_M\}$, thus contradicting the assumption that $\mathscr{B}$ is a generating set.


In view of this theorem, a basis $\mathscr{B}$ can be defined by any one of the three equivalent conditions (i), (ii), or (iii).
Exercises


9.1 In elementary plane geometry it is shown that two straight lines determine a plane if the straight lines satisfy a certain condition. What is the condition? Express the condition in vector notation.

9.2 Prove Theorems 9.1-9.3, and 9.9.

9.3 Let $\mathscr{M}^{3\times 3}$ denote the vector space of all $3 \times 3$ matrices with elements from the complex numbers $\mathscr{C}$. Determine a basis for $\mathscr{M}^{3\times 3}$.

9.4 Let $\mathscr{M}^{2\times 2}$ denote the vector space of all $2 \times 2$ matrices with elements from the real numbers $\mathscr{R}$. Is either of the following sets a basis for $\mathscr{M}^{2\times 2}$?


(ool [o 2} | 


0 1 
3 0 


| 


eal 


1 0] [fo 1 
lo o? [0 0 


0 0 


| fe 


| 


oo] 


9.5 Are the complex numbers $2 + 4i$ and $6 + 2i$ linearly independent with respect to the field of real numbers $\mathscr{R}$? Are they linearly independent with respect to the field of complex numbers?




Section 10. Intersection, Sum and Direct Sum of Subspaces 


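In this section operations such as “summing” and “intersecting” vector spaces are discussed. We introduce first the important concept of a subspace of a vector space in analogy with a subgroup of a group. A nonempty subset $\mathscr{U}$ of a vector space $\mathscr{V}$ is a subspace if:


(a) $\mathbf{u}, \mathbf{w} \in \mathscr{U} \Rightarrow \mathbf{u} + \mathbf{w} \in \mathscr{U}$ for all $\mathbf{u}, \mathbf{w} \in \mathscr{U}$.
(b) $\mathbf{u} \in \mathscr{U} \Rightarrow \lambda\mathbf{u} \in \mathscr{U}$ for all $\lambda \in \mathscr{C}$.


Conditions (a) and (b) in this definition can be replaced by the equivalent condition:


(a') $\mathbf{u}, \mathbf{w} \in \mathscr{U} \Rightarrow \lambda\mathbf{u} + \mu\mathbf{w} \in \mathscr{U}$ for all $\lambda, \mu \in \mathscr{C}$.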


Examples of subspaces of vector spaces are given in the following list: 


1. The subset of the vector space $\mathscr{C}^N$ of all $N$-tuples of the form $(0, \lambda_2, \lambda_3, \dots, \lambda_N)$ is a subspace of $\mathscr{C}^N$.

2. Any vector space $\mathscr{V}$ is a subspace of itself.

3. The set consisting of the zero vector $\{\mathbf{0}\}$ is a subspace of $\mathscr{V}$.

4. The set of real numbers $\mathscr{R}$ can be viewed as a subspace of the real space of complex numbers $\mathscr{C}$.


The vector spaces $\{\mathbf{0}\}$ and $\mathscr{V}$ itself are considered to be trivial subspaces of the vector space $\mathscr{V}$. If $\mathscr{U}$ is not a trivial subspace, it is said to be a proper subspace of $\mathscr{V}$.


Several properties of subspaces that one would naturally expect to be true are developed in 
the following theorems. 


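Theorem 10.1. If $\mathscr{U}$ is a subspace of $\mathscr{V}$, then $\mathbf{0} \in \mathscr{U}$.


Proof. The proof of this theorem follows easily from (b) in the definition of a subspace above by setting $\lambda = 0$.


Theorem 10.2. If $\mathscr{U}$ is a subspace of $\mathscr{V}$, then $\dim \mathscr{U} \leq \dim \mathscr{V}$.


Proof. By Theorem 9.8 we know that any basis of $\mathscr{U}$ can be enlarged to a basis of $\mathscr{V}$; it follows that $\dim \mathscr{U} \leq \dim \mathscr{V}$.

Theorem 10.3. If $\mathscr{U}$ is a subspace of $\mathscr{V}$, then $\dim \mathscr{U} = \dim \mathscr{V}$ if and only if $\mathscr{U} = \mathscr{V}$.


Proof. If $\mathscr{U} = \mathscr{V}$, then clearly $\dim \mathscr{U} = \dim \mathscr{V}$. Conversely, if $\dim \mathscr{U} = \dim \mathscr{V}$, then a basis for $\mathscr{U}$ is also a basis for $\mathscr{V}$. Thus, any vector $\mathbf{v} \in \mathscr{V}$ is also in $\mathscr{U}$, and this implies $\mathscr{U} = \mathscr{V}$.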


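Operations that combine vector spaces to form other vector spaces are simple extensions of elementary operations defined on sets. If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of a vector space $\mathscr{V}$, the sum of $\mathscr{U}$ and $\mathscr{W}$, written $\mathscr{U} + \mathscr{W}$, is the set

$$
\mathscr{U} + \mathscr{W} = \{\mathbf{u} + \mathbf{w} \mid \mathbf{u} \in \mathscr{U},\ \mathbf{w} \in \mathscr{W}\}
$$

Similarly, if $\mathscr{U}$ and $\mathscr{W}$ are subspaces of a vector space $\mathscr{V}$, the intersection of $\mathscr{U}$ and $\mathscr{W}$, denoted by $\mathscr{U} \cap \mathscr{W}$, is the set

$$
\mathscr{U} \cap \mathscr{W} = \{\mathbf{u} \mid \mathbf{u} \in \mathscr{U} \text{ and } \mathbf{u} \in \mathscr{W}\}
$$

The union of two subspaces $\mathscr{U}$ and $\mathscr{W}$ of $\mathscr{V}$, denoted by $\mathscr{U} \cup \mathscr{W}$, is the set

$$
\mathscr{U} \cup \mathscr{W} = \{\mathbf{u} \mid \mathbf{u} \in \mathscr{U} \text{ or } \mathbf{u} \in \mathscr{W}\}
$$

Some properties of these three operations are stated in the following theorems.

Theorem 10.4. If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$, then $\mathscr{U} + \mathscr{W}$ is a subspace of $\mathscr{V}$.

Theorem 10.5. If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$, then $\mathscr{U}$ and $\mathscr{W}$ are also subspaces of $\mathscr{U} + \mathscr{W}$.


Theorem 10.6. If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$, then the intersection $\mathscr{U} \cap \mathscr{W}$ is a subspace of $\mathscr{V}$.


Proof. Let $\mathbf{u}, \mathbf{w} \in \mathscr{U} \cap \mathscr{W}$. Then $\mathbf{u}, \mathbf{w} \in \mathscr{W}$ and $\mathbf{u}, \mathbf{w} \in \mathscr{U}$. Since $\mathscr{U}$ and $\mathscr{W}$ are subspaces, $\mathbf{u} + \mathbf{w} \in \mathscr{W}$ and $\mathbf{u} + \mathbf{w} \in \mathscr{U}$, which means that $\mathbf{u} + \mathbf{w} \in \mathscr{U} \cap \mathscr{W}$ also. Similarly, if $\mathbf{u} \in \mathscr{U} \cap \mathscr{W}$, then for all $\lambda \in \mathscr{C}$, $\lambda\mathbf{u} \in \mathscr{U} \cap \mathscr{W}$.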
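Theorem 10.7. If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$, then the union $\mathscr{U} \cup \mathscr{W}$ is not generally a subspace of $\mathscr{V}$.


Proof. Let $\mathbf{u} \in \mathscr{U}$, $\mathbf{u} \notin \mathscr{W}$, and let $\mathbf{w} \in \mathscr{W}$ and $\mathbf{w} \notin \mathscr{U}$; then $\mathbf{u} + \mathbf{w} \notin \mathscr{U}$ and $\mathbf{u} + \mathbf{w} \notin \mathscr{W}$, which means that $\mathbf{u} + \mathbf{w} \notin \mathscr{U} \cup \mathscr{W}$.


Theorem 10.8. Let $\mathbf{v}$ be a vector in $\mathscr{U} + \mathscr{W}$, where $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$. The decomposition of $\mathbf{v} \in \mathscr{U} + \mathscr{W}$ into the form $\mathbf{v} = \mathbf{u} + \mathbf{w}$, where $\mathbf{u} \in \mathscr{U}$ and $\mathbf{w} \in \mathscr{W}$, is unique if and only if $\mathscr{U} \cap \mathscr{W} = \{\mathbf{0}\}$.


Proof. Suppose there are two ways of decomposing $\mathbf{v}$; for example, let there be a $\mathbf{u}$ and $\mathbf{u}'$ in $\mathscr{U}$ and a $\mathbf{w}$ and $\mathbf{w}'$ in $\mathscr{W}$ such that

$$
\mathbf{v} = \mathbf{u} + \mathbf{w} \quad \text{and} \quad \mathbf{v} = \mathbf{u}' + \mathbf{w}'
$$

The decomposition of $\mathbf{v}$ is unique if it can be shown that the vector $\mathbf{b}$,

$$
\mathbf{b} = \mathbf{u} - \mathbf{u}' = \mathbf{w}' - \mathbf{w}
$$

vanishes. The vector $\mathbf{b}$ is contained in $\mathscr{U} \cap \mathscr{W}$ since $\mathbf{u}$ and $\mathbf{u}'$ are known to be in $\mathscr{U}$, and $\mathbf{w}$ and $\mathbf{w}'$ are known to be in $\mathscr{W}$. Therefore $\mathscr{U} \cap \mathscr{W} = \{\mathbf{0}\}$ implies uniqueness. Conversely, if we have uniqueness, then $\mathscr{U} \cap \mathscr{W} = \{\mathbf{0}\}$, for otherwise any nonzero vector $\mathbf{y} \in \mathscr{U} \cap \mathscr{W}$ has at least two decompositions $\mathbf{y} = \mathbf{y} + \mathbf{0} = \mathbf{0} + \mathbf{y}$.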


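The sum of two subspaces $\mathscr{U}$ and $\mathscr{W}$ is called the direct sum of $\mathscr{U}$ and $\mathscr{W}$ and denoted by $\mathscr{U} \oplus \mathscr{W}$ if $\mathscr{U} \cap \mathscr{W} = \{\mathbf{0}\}$. This definition is motivated by the result proved in Theorem 10.8. If $\mathscr{U} \oplus \mathscr{W} = \mathscr{V}$, then $\mathscr{U}$ is called the direct complement of $\mathscr{W}$ in $\mathscr{V}$. The operation of direct summing can be extended to a finite number of subspaces $\mathscr{V}_1, \mathscr{V}_2, \dots, \mathscr{V}_Q$ of $\mathscr{V}$. The direct sum $\mathscr{V}_1 \oplus \mathscr{V}_2 \oplus \cdots \oplus \mathscr{V}_Q$ is required to satisfy the conditions

$$
\mathscr{V}_R \cap \sum_{\substack{K=1 \\ K \neq R}}^{Q} \mathscr{V}_K = \{\mathbf{0}\} \qquad \text{for } R = 1, 2, \dots, Q
$$

The concept of a direct sum of subspaces is an important tool for the study of certain concepts in geometry, as we shall see in later chapters. The following theorem shows that the dimension of a direct sum of subspaces is equal to the sum of the dimensions of the subspaces.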


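Theorem 10.9. If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$, then

$$
\dim \mathscr{U} \oplus \mathscr{W} = \dim \mathscr{U} + \dim \mathscr{W}
$$

Proof. Let $\{\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_R\}$ be a basis for $\mathscr{U}$ and $\{\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_Q\}$ be a basis for $\mathscr{W}$. Then the set of vectors $\{\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_R, \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_Q\}$ is linearly independent since $\mathscr{U} \cap \mathscr{W} = \{\mathbf{0}\}$. This set of vectors generates $\mathscr{U} \oplus \mathscr{W}$ since for any $\mathbf{v} \in \mathscr{U} \oplus \mathscr{W}$ we can write

$$
\mathbf{v} = \mathbf{u} + \mathbf{w} = \sum_{j=1}^{R} \lambda^j \mathbf{u}_j + \sum_{j=1}^{Q} \mu^j \mathbf{w}_j
$$

where $\mathbf{u} \in \mathscr{U}$ and $\mathbf{w} \in \mathscr{W}$. Therefore by Theorem 9.10, $\dim \mathscr{U} \oplus \mathscr{W} = R + Q$.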
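The dimension count in Theorem 10.9 can be checked numerically for subspaces given by spanning sets. The sketch below is our own illustration under stated assumptions: vectors have rational entries, the helper names `rank` and `is_direct_sum` are hypothetical, and the test for directness relies on the standard fact (not proved in this section) that $\dim(\mathscr{U} + \mathscr{W}) = \dim \mathscr{U} + \dim \mathscr{W}$ exactly when $\mathscr{U} \cap \mathscr{W} = \{\mathbf{0}\}$.

```python
from fractions import Fraction


def rank(vectors):
    """Dimension of the span of `vectors`, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_direct_sum(u_span, w_span):
    """True when the spans U and W meet only in the zero vector.

    The span of the concatenated lists is U + W, so the sum is direct
    exactly when its dimension equals dim U + dim W.
    """
    return rank(u_span + w_span) == rank(u_span) + rank(w_span)


# U = span{(1,0,0)} and W = span{(0,1,0),(0,0,1)} form a direct sum in R^3.
direct = is_direct_sum([[1, 0, 0]], [[0, 1, 0], [0, 0, 1]])
# Two coordinate planes share a line, so their sum is not direct.
not_direct = is_direct_sum([[1, 0, 0], [0, 1, 0]], [[0, 1, 0], [0, 0, 1]])
```

In the second case the two planes span all of $\mathscr{R}^3$, so the dimension of the sum is 3 while the dimensions add to 4; the shortfall is the dimension of the shared line.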


The result of this theorem can easily be generalized to

$$
\dim(\mathscr{V}_1 \oplus \mathscr{V}_2 \oplus \cdots \oplus \mathscr{V}_Q) = \sum_{j=1}^{Q} \dim \mathscr{V}_j
$$

The designation “direct sum” is sometimes used in a slightly different context. If $\mathscr{V}$ and $\mathscr{U}$ are vector spaces, not necessarily subspaces of a common vector space, the set $\mathscr{V} \times \mathscr{U}$ can be given the vector space structure by defining addition and scalar multiplication as in Exercise 8.5. The set $\mathscr{V} \times \mathscr{U}$ with this vector space structure is also called the direct sum and is written $\mathscr{V} \oplus \mathscr{U}$. This concept of direct sum is slightly more general since $\mathscr{V}$ and $\mathscr{U}$ need not initially be subspaces of a third vector space. However, after we have defined $\mathscr{V} \oplus \mathscr{U}$, then $\mathscr{V}$ and $\mathscr{U}$ can be viewed as subspaces of $\mathscr{V} \oplus \mathscr{U}$; further, in that sense, $\mathscr{V} \oplus \mathscr{U}$ is the direct sum of $\mathscr{V}$ and $\mathscr{U}$ in accordance with the original definition of the direct sum of subspaces.


Exercises 


10.1 Under what conditions can a single vector constitute a subspace? 

10.2 Prove Theorems 10.4 and 10.5. 

10.3 If $\mathscr{U}$ and $\mathscr{W}$ are subspaces of $\mathscr{V}$, show that any subspace which contains $\mathscr{U} \cup \mathscr{W}$ also contains $\mathscr{U} + \mathscr{W}$.

10.4 If $\mathscr{V}$ and $\mathscr{U}$ are vector spaces and if $\mathscr{V} \oplus \mathscr{U}$ is their direct sum in the sense of the second definition given in this section, reprove Theorem 10.9.

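10.5 Let $\mathscr{V} = \mathscr{R}^3$, and suppose that $\mathscr{U}$ is the subspace spanned by $\{(0, 1, 1)\}$ and $\mathscr{W}$ is the subspace spanned by $\{(1, 0, 1), (1, 1, 1)\}$. Show that $\mathscr{V} = \mathscr{U} \oplus \mathscr{W}$. Find another subspace $\hat{\mathscr{U}}$ such that $\mathscr{V} = \hat{\mathscr{U}} \oplus \mathscr{W}$.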
Section 11. Factor Space


In Section 2 the concept of an equivalence relation on a set was introduced and in Exercise 2.2 an equivalence relation was used to partition a set into equivalence sets. In this section an equivalence relation is introduced and the vector space is partitioned into equivalence sets. The class of all equivalence sets is itself a vector space called the factor space.


If $\mathscr{U}$ is a subspace of a vector space $\mathscr{V}$, then two vectors $\mathbf{w}$ and $\mathbf{v}$ in $\mathscr{V}$ are said to be equivalent with respect to $\mathscr{U}$, written $\mathbf{v} \sim \mathbf{w}$, if $\mathbf{w} - \mathbf{v}$ is a vector contained in $\mathscr{U}$. It is easy to see that this relation is an equivalence relation and it induces a partition of $\mathscr{V}$ into equivalence sets of vectors. If $\mathbf{v} \in \mathscr{V}$, then the equivalence set of $\mathbf{v}$, denoted by $\bar{\mathbf{v}}$,¹ is the set of all vectors of the form $\mathbf{v} + \mathbf{u}$, where $\mathbf{u}$ is any vector of $\mathscr{U}$,

$$
\bar{\mathbf{v}} = \{\mathbf{v} + \mathbf{u} \mid \mathbf{u} \in \mathscr{U}\}
$$


To illustrate the equivalence relation and its decomposition of a vector space into equivalence sets, we will consider the real vector space $\mathscr{R}^2$, which we can represent by the Euclidean plane. Let $\mathbf{u}$ be a fixed vector in $\mathscr{R}^2$, and define the subspace $\mathscr{U}$ of $\mathscr{R}^2$ by $\{\lambda\mathbf{u} \mid \lambda \in \mathscr{R}\}$. This subspace consists of all vectors of the form $\lambda\mathbf{u}$, $\lambda \in \mathscr{R}$, which are all parallel to the same straight line. This subspace is illustrated in Figure 2. From the definition of equivalence, the vector $\mathbf{v}$ is seen to be equivalent to the vector $\mathbf{w}$ if the vector $\mathbf{v} - \mathbf{w}$ is parallel to the line representing $\mathscr{U}$. Therefore all vectors that differ from the vector $\mathbf{v}$ by a vector that is parallel to the line representing $\mathscr{U}$ are equivalent. The set of all vectors equivalent to the vector $\mathbf{v}$ is therefore the set of all vectors that terminate on the dashed line parallel to the line representing $\mathscr{U}$ in Figure 2. The equivalence set of $\mathbf{v}$, $\bar{\mathbf{v}}$, is the set of all such vectors. The following theorem is a special case of Exercise 2.2 and shows that an equivalence relation decomposes $\mathscr{V}$ into disjoint sets, that is to say, each vector is contained in one and only one equivalence set.


1 Only in this section do we use a bar over a letter to denote an equivalence set. In other sections, unless otherwise 
specified, a bar denotes the complex conjugate. 
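Figure 2


Theorem 11.1. If $\bar{\mathbf{v}} \neq \bar{\mathbf{w}}$, then $\bar{\mathbf{v}} \cap \bar{\mathbf{w}} = \varnothing$.


Proof. Assume that $\bar{\mathbf{v}} \neq \bar{\mathbf{w}}$ but that there exists a vector $\mathbf{x}$ in $\bar{\mathbf{v}} \cap \bar{\mathbf{w}}$. Then $\mathbf{x} \in \bar{\mathbf{v}}$, $\mathbf{x} \sim \mathbf{v}$, and $\mathbf{x} \in \bar{\mathbf{w}}$, $\mathbf{x} \sim \mathbf{w}$. By the transitive property of the equivalence relation we have $\mathbf{v} \sim \mathbf{w}$, which implies $\bar{\mathbf{v}} = \bar{\mathbf{w}}$ and which is a contradiction. Therefore $\bar{\mathbf{v}} \cap \bar{\mathbf{w}}$ contains no vector unless $\bar{\mathbf{v}} = \bar{\mathbf{w}}$, in which case $\bar{\mathbf{v}} \cap \bar{\mathbf{w}} = \bar{\mathbf{v}}$.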


We shall now develop the structure of the factor space. The factor class of $\mathscr{U}$, denoted by $\mathscr{V}/\mathscr{U}$, is the class of all equivalence sets in $\mathscr{V}$ formed by using a subspace $\mathscr{U}$ of $\mathscr{V}$. The factor class is sometimes called a quotient class. Addition and scalar multiplication of equivalence sets are defined by

$$
\bar{\mathbf{v}} + \bar{\mathbf{w}} = \overline{\mathbf{v} + \mathbf{w}}
$$

and

$$
\lambda\bar{\mathbf{v}} = \overline{\lambda\mathbf{v}}
$$

respectively. It is easy to verify that the addition and multiplication operations defined above depend only on the equivalence sets and not on any particular vectors used in representing the sets. The following theorem is easy to prove.

Theorem 11.2. The factor set $\mathscr{V}/\mathscr{U}$ forms a vector space, called a factor space, with respect to the operations of addition and scalar multiplication of equivalence classes defined above.


The factor space is also called a quotient space. The subspace $\mathscr{U}$ in $\mathscr{V}/\mathscr{U}$ plays the role of the zero vector of the factor space. In the trivial case when $\mathscr{U} = \mathscr{V}$, there is only one equivalence set and it plays the role of the zero vector. On the other extreme when $\mathscr{U} = \{\mathbf{0}\}$, then each equivalence set is a single vector and $\mathscr{V}/\{\mathbf{0}\} = \mathscr{V}$.
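The claim that addition of equivalence sets does not depend on the chosen representatives can be tried out in the plane. The following sketch is an illustration of ours, not the authors' construction: it takes $\mathscr{V} = \mathscr{R}^2$ with integer coordinates and $\mathscr{U}$ the line through a fixed nonzero vector `u`, and the names `cross`, `equivalent`, and `class_label` are hypothetical. Two vectors are equivalent when their difference is parallel to `u`, which in two dimensions is detected by a vanishing $2 \times 2$ determinant.

```python
def cross(a, b):
    """2-D determinant a1*b2 - a2*b1; for nonzero b it vanishes exactly when a is parallel to b."""
    return a[0] * b[1] - a[1] * b[0]


def equivalent(v, w, u):
    """v ~ w with respect to U = {lambda * u}: the difference w - v lies in U."""
    diff = (w[0] - v[0], w[1] - v[1])
    return cross(diff, u) == 0


def class_label(v, u):
    """A number shared by all members of the equivalence set of v, and by no other set."""
    return cross(u, v)


def add(a, b):
    return (a[0] + b[0], a[1] + b[1])


u = (2, 1)                 # U is the line through the origin along u
v, v_alt = (1, 3), (3, 4)  # v_alt = v + u, so both represent the same set
x, x_alt = (0, 1), (-4, -1)  # x_alt = x - 2u, so both represent the same set

# Different representatives give sums that land in the same equivalence set.
same_sum_class = equivalent(add(v, x), add(v_alt, x_alt), u)
```

Because `class_label` is linear in `v` and vanishes on $\mathscr{U}$, it identifies each equivalence set with a single real number in this example.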


Exercises 


11.1 Show that the relation between two vectors $\mathbf{v}$ and $\mathbf{w} \in \mathscr{V}$ that makes them equivalent is, in fact, an equivalence relation in the sense of Section 2.

11.2 Give a geometrical interpretation of the process of addition and scalar multiplication of equivalence sets in $\mathscr{R}^2$ in Figure 2.

11.3 Show that $\mathscr{V}$ is equal to the union of all the equivalence sets in $\mathscr{V}$. Thus $\mathscr{V}/\mathscr{U}$ is a class of nonempty, disjoint sets whose union is the entire space $\mathscr{V}$.

11.4 Prove Theorem 11.2.

11.5 Show that $\dim(\mathscr{V}/\mathscr{U}) = \dim \mathscr{V} - \dim \mathscr{U}$.
Section 12. Inner Product Spaces


There is no concept of length or magnitude in the definition of a vector space we have been employing. We have delayed introducing this concept because many of the results of interest do not need it, and the delay emphasizes that independence. Our intended applications of the theory, however, do employ it extensively.


We define the concept of length through the concept of an inner product. An inner product on a complex vector space $\mathscr{V}$ is a function $f: \mathscr{V} \times \mathscr{V} \to \mathscr{C}$ with the following properties:


(1) $f(\mathbf{u}, \mathbf{v}) = \overline{f(\mathbf{v}, \mathbf{u})}$;
(2) $\lambda f(\mathbf{u}, \mathbf{v}) = f(\lambda\mathbf{u}, \mathbf{v})$;
(3) $f(\mathbf{u} + \mathbf{w}, \mathbf{v}) = f(\mathbf{u}, \mathbf{v}) + f(\mathbf{w}, \mathbf{v})$;
(4) $f(\mathbf{u}, \mathbf{u}) \geq 0$ and $f(\mathbf{u}, \mathbf{u}) = 0 \Leftrightarrow \mathbf{u} = \mathbf{0}$;


for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathscr{V}$ and $\lambda \in \mathscr{C}$. In Property 1 the bar denotes the complex conjugate. Properties 2 and 3 require that $f$ be linear in its first argument; i.e., $f(\lambda\mathbf{u} + \mu\mathbf{v}, \mathbf{w}) = \lambda f(\mathbf{u}, \mathbf{w}) + \mu f(\mathbf{v}, \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathscr{V}$ and all $\lambda, \mu \in \mathscr{C}$. Property 1 and the linearity implied by Properties 2 and 3 ensure that $f$ is conjugate linear in its second argument; i.e., $f(\mathbf{u}, \lambda\mathbf{v} + \mu\mathbf{w}) = \bar{\lambda} f(\mathbf{u}, \mathbf{v}) + \bar{\mu} f(\mathbf{u}, \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathscr{V}$ and all $\lambda, \mu \in \mathscr{C}$. Since Property 1 ensures that $f(\mathbf{u}, \mathbf{u})$ is real, Property 4 is meaningful, and it requires that $f$ be positive definite.


There are many notations for the inner product. We shall employ the notation of the “dot product” and write

$$
f(\mathbf{u}, \mathbf{v}) = \mathbf{u} \cdot \mathbf{v}
$$


An inner product space is simply a vector space with an inner product. To emphasize the 
importance of this idea and to focus simultaneously all its details, we restate the definition as 
follows. 


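Definition. A complex inner product space, or simply an inner product space, is a set $\mathscr{V}$ and a field $\mathscr{C}$ such that:


(a) There exists a binary operation in $\mathscr{V}$ called addition and denoted by $+$ such that:
(1) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathscr{V}$.
(2) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ for all $\mathbf{u}, \mathbf{v} \in \mathscr{V}$.
(3) There exists an element $\mathbf{0} \in \mathscr{V}$ such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$ for all $\mathbf{u} \in \mathscr{V}$.
(4) For every $\mathbf{u} \in \mathscr{V}$ there exists an element $-\mathbf{u} \in \mathscr{V}$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.


(b) There exists an operation called scalar multiplication in which every scalar $\lambda \in \mathscr{C}$ can be combined with every element $\mathbf{u} \in \mathscr{V}$ to give an element $\lambda\mathbf{u} \in \mathscr{V}$ such that:
(1) $\lambda(\mu\mathbf{u}) = (\lambda\mu)\mathbf{u}$;
(2) $(\lambda + \mu)\mathbf{u} = \lambda\mathbf{u} + \mu\mathbf{u}$;
(3) $\lambda(\mathbf{u} + \mathbf{v}) = \lambda\mathbf{u} + \lambda\mathbf{v}$;
(4) $1\mathbf{u} = \mathbf{u}$;
for all $\lambda, \mu \in \mathscr{C}$ and all $\mathbf{u}, \mathbf{v} \in \mathscr{V}$;


(c) There exists an operation called inner product by which any ordered pair of vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathscr{V}$ determines an element of $\mathscr{C}$ denoted by $\mathbf{u} \cdot \mathbf{v}$ such that
(1) $\mathbf{u} \cdot \mathbf{v} = \overline{\mathbf{v} \cdot \mathbf{u}}$;
(2) $\lambda\,\mathbf{u} \cdot \mathbf{v} = (\lambda\mathbf{u}) \cdot \mathbf{v}$;
(3) $(\mathbf{u} + \mathbf{w}) \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{v} + \mathbf{w} \cdot \mathbf{v}$;
(4) $\mathbf{u} \cdot \mathbf{u} \geq 0$ and $\mathbf{u} \cdot \mathbf{u} = 0 \Leftrightarrow \mathbf{u} = \mathbf{0}$;
for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathscr{V}$ and $\lambda \in \mathscr{C}$.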


A real inner product space is defined similarly. The vector space $\mathscr{C}^N$ becomes an inner product space if, for any two vectors $\mathbf{u}, \mathbf{v} \in \mathscr{C}^N$, where $\mathbf{u} = (\lambda_1, \lambda_2, \dots, \lambda_N)$ and $\mathbf{v} = (\mu_1, \mu_2, \dots, \mu_N)$, we define the inner product of $\mathbf{u}$ and $\mathbf{v}$ by

$$
\mathbf{u} \cdot \mathbf{v} = \sum_{j=1}^{N} \lambda_j \bar{\mu}_j
$$

The length of a vector is an operation, denoted by $\|\ \|$, that assigns to each nonzero vector $\mathbf{v} \in \mathscr{V}$ a positive real number by the following rule:

$$
\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} \tag{12.1}
$$

Of course the length of the zero vector is zero. The definition represented by (12.1) is for an inner product space of $N$ dimensions in general and therefore generalizes the concept of “length” or “magnitude” from elementary Euclidean plane geometry to $N$-dimensional spaces. Before continuing this process of algebraically generalizing geometric notions, it is necessary to pause and prove two inequalities that will aid in the generalization process.


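Theorem 12.1. The Schwarz inequality,

$$
|\mathbf{u} \cdot \mathbf{v}| \leq \|\mathbf{u}\|\,\|\mathbf{v}\| \tag{12.2}
$$

is valid for any two vectors $\mathbf{u}, \mathbf{v}$ in an inner product space.


Proof. The Schwarz inequality is easily seen to be trivially true when either $\mathbf{u}$ or $\mathbf{v}$ is the $\mathbf{0}$ vector, so we shall assume that neither $\mathbf{u}$ nor $\mathbf{v}$ is zero. Construct the vector $(\mathbf{u} \cdot \mathbf{u})\mathbf{v} - (\mathbf{v} \cdot \mathbf{u})\mathbf{u}$ and employ Property (c4), which requires that every vector have a nonnegative length, hence

$$
\left( \|\mathbf{u}\|^2 \|\mathbf{v}\|^2 - (\mathbf{u} \cdot \mathbf{v})\overline{(\mathbf{u} \cdot \mathbf{v})} \right) \|\mathbf{u}\|^2 \geq 0
$$

Since $\mathbf{u}$ must not be zero, it follows that

$$
\|\mathbf{u}\|^2 \|\mathbf{v}\|^2 \geq (\mathbf{u} \cdot \mathbf{v})\overline{(\mathbf{u} \cdot \mathbf{v})} = |\mathbf{u} \cdot \mathbf{v}|^2
$$

and the positive square root of this inequality is Schwarz’s inequality.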
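Theorem 12.2. The triangle inequality

$$
\|\mathbf{u} + \mathbf{v}\| \leq \|\mathbf{u}\| + \|\mathbf{v}\| \tag{12.3}
$$

is valid for any two vectors $\mathbf{u}, \mathbf{v}$ in an inner product space.

Proof. The squared length of $\mathbf{u} + \mathbf{v}$ can be written in the form

$$
\|\mathbf{u} + \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 + \mathbf{u} \cdot \mathbf{v} + \mathbf{v} \cdot \mathbf{u} = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 + 2\,\mathrm{Re}(\mathbf{u} \cdot \mathbf{v})
$$

where Re signifies the real part. Since $\mathrm{Re}(\mathbf{u} \cdot \mathbf{v}) \leq |\mathbf{u} \cdot \mathbf{v}|$, use of the Schwarz inequality gives

$$
\|\mathbf{u} + \mathbf{v}\|^2 \leq \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 + 2\|\mathbf{u}\|\,\|\mathbf{v}\| = (\|\mathbf{u}\| + \|\mathbf{v}\|)^2
$$

Taking the positive square root of this inequality, we obtain the triangle inequality.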
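Both inequalities are easy to spot-check for the inner product on $\mathscr{C}^N$ defined earlier in this section. The following is a small sketch of ours using Python's built-in complex numbers; the function names `inner` and `length` are our own, and a numerical check on one pair of vectors is of course an illustration, not a proof.

```python
import math


def inner(u, v):
    """Inner product on C^N: sum_j u_j * conj(v_j), linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def length(v):
    """Length ||v|| = sqrt(v . v); v . v is real and nonnegative."""
    return math.sqrt(inner(v, v).real)


u = [2 + 1j, 7, 8 + 3j]
v = [1 - 1j, 2j, 3]
u_plus_v = [a + b for a, b in zip(u, v)]

# Schwarz inequality: |u . v| <= ||u|| ||v||
schwarz_holds = abs(inner(u, v)) <= length(u) * length(v)
# Triangle inequality: ||u + v|| <= ||u|| + ||v||
triangle_holds = length(u_plus_v) <= length(u) + length(v)
```

Note that `inner(u, v)` and `inner(v, u)` are complex conjugates of each other, as Property (c1) requires, so the order of the arguments matters for complex vectors.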


For a real inner product space the concept of angle is defined as follows. The angle between two vectors $\mathbf{u}$ and $\mathbf{v}$, denoted by $\theta$, is defined by

$$
\cos\theta = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \tag{12.4}
$$

This definition of a real-valued angle is meaningful because the Schwarz inequality in this case shows that the quantity on the right-hand side of (12.4) must have a value lying between $1$ and $-1$, i.e.,

$$
-1 \leq \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \leq 1
$$

Returning to complex inner product spaces in general, we say that two vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if $\mathbf{u} \cdot \mathbf{v}$ or $\mathbf{v} \cdot \mathbf{u}$ is zero. Clearly, this definition is consistent with the real case, since orthogonality then means $\theta = \pi/2$.


The inner product space is a very substantial algebraic structure and parts of the structure can be given slightly different interpretations. In particular it can be shown that the length $\|\mathbf{v}\|$ of a vector $\mathbf{v} \in \mathscr{V}$ is a particular example of a mathematical concept known as a norm, and thus an inner product space is a normed space. A norm on $\mathscr{V}$ is a real-valued function defined on $\mathscr{V}$ whose value is denoted by $\|\mathbf{v}\|$ and which satisfies the following axioms:


(1) $\|\mathbf{v}\| \geq 0$ and $\|\mathbf{v}\| = 0 \Leftrightarrow \mathbf{v} = \mathbf{0}$;
(2) $\|\lambda\mathbf{v}\| = |\lambda|\,\|\mathbf{v}\|$;
(3) $\|\mathbf{u}\| + \|\mathbf{v}\| \geq \|\mathbf{u} + \mathbf{v}\|$;


for all $\mathbf{u}, \mathbf{v} \in \mathscr{V}$ and all $\lambda \in \mathscr{C}$. In defining the norm of $\mathbf{v}$, we have employed the same notation as that for the length of $\mathbf{v}$ because we will show that the length defined by an inner product is a norm, but the converse need not be true.


Theorem 12.3. The operation of determining the length $\|\mathbf{v}\|$ of $\mathbf{v} \in \mathscr{V}$ is a norm on $\mathscr{V}$.


Proof. The proof follows easily from the definition of length. Properties (c1), (c2), and (c4) of an inner product imply Axioms 1 and 2 of a norm, and the triangle inequality is proved by Theorem 12.2.


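We will now show that the inner product space is also a metric space. A metric space is a nonempty set $\mathscr{M}$ equipped with a positive real-valued function $d: \mathscr{M} \times \mathscr{M} \to \mathscr{R}$, called the distance function, that satisfies the following axioms:

(1) $d(\mathbf{u}, \mathbf{v}) \geq 0$ and $d(\mathbf{u}, \mathbf{v}) = 0 \Leftrightarrow \mathbf{u} = \mathbf{v}$;
(2) $d(\mathbf{u}, \mathbf{v}) = d(\mathbf{v}, \mathbf{u})$;
(3) $d(\mathbf{u}, \mathbf{w}) \leq d(\mathbf{u}, \mathbf{v}) + d(\mathbf{v}, \mathbf{w})$;

for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathscr{M}$. In other words, a metric space is a set $\mathscr{M}$ of objects and a positive-definite, symmetric distance function $d$ satisfying the triangle inequality. The boldface notation for the elements of $\mathscr{M}$ is simply to save the introduction of yet another notation.


Theorem 12.4. An inner product space $\mathscr{V}$ is a metric space with the distance function given by

$$
d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| \tag{12.5}
$$

Proof. Let $\mathbf{w} = \mathbf{u} - \mathbf{v}$; then from the requirement that $\|\mathbf{w}\| \geq 0$ and $\|\mathbf{w}\| = 0 \Leftrightarrow \mathbf{w} = \mathbf{0}$ it follows that

$$
\|\mathbf{u} - \mathbf{v}\| \geq 0 \quad \text{and} \quad \|\mathbf{u} - \mathbf{v}\| = 0 \Leftrightarrow \mathbf{u} = \mathbf{v}
$$

Similarly, from the requirement that $\|\mathbf{w}\| = \|-\mathbf{w}\|$, it follows that

$$
\|\mathbf{u} - \mathbf{v}\| = \|\mathbf{v} - \mathbf{u}\|
$$

Finally, let $\mathbf{u}$ be replaced by $\mathbf{u} - \mathbf{v}$ and $\mathbf{v}$ by $\mathbf{v} - \mathbf{w}$ in the triangle inequality (12.3); then

$$
\|\mathbf{u} - \mathbf{w}\| \leq \|\mathbf{u} - \mathbf{v}\| + \|\mathbf{v} - \mathbf{w}\|
$$

which is the third and last requirement for a distance function in a metric space.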


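The inner product as well as the expression (12.1) for the length of a vector can be expressed in terms of any basis and the components of the vectors relative to that basis. Let $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$ be a basis of $\mathscr{V}$ and denote the inner product of any two base vectors by $e_{jk}$,

$$
e_{jk} \equiv \mathbf{e}_j \cdot \mathbf{e}_k \tag{12.6}
$$

Thus, if the vectors $\mathbf{u}$ and $\mathbf{v}$ have the representations

$$
\mathbf{u} = \sum_{j=1}^{N} \lambda^j \mathbf{e}_j, \qquad \mathbf{v} = \sum_{j=1}^{N} \mu^j \mathbf{e}_j \tag{12.7}
$$

relative to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$, then the inner product of $\mathbf{u}$ and $\mathbf{v}$ is given by

$$
\mathbf{u} \cdot \mathbf{v} = \sum_{j=1}^{N} \lambda^j \mathbf{e}_j \cdot \sum_{k=1}^{N} \mu^k \mathbf{e}_k = \sum_{j=1}^{N} \sum_{k=1}^{N} \lambda^j \bar{\mu}^k e_{jk} \tag{12.8}
$$

Equation (12.8) is the component expression for the inner product.


From the definition (12.1) for the length of a vector $\mathbf{v}$, and from (12.6) and (12.7)$_2$, we can write

$$
\|\mathbf{v}\| = \left( \sum_{j=1}^{N} \sum_{k=1}^{N} e_{jk}\, \mu^j \bar{\mu}^k \right)^{1/2} \tag{12.9}
$$

This equation gives the component expression for the length of a vector. For a real inner product space, it easily follows that

$$
\cos\theta = \frac{\displaystyle\sum_{j=1}^{N} \sum_{k=1}^{N} e_{jk}\, \lambda^j \mu^k}{\left( \displaystyle\sum_{p=1}^{N} \sum_{r=1}^{N} e_{pr}\, \lambda^p \lambda^r \right)^{1/2} \left( \displaystyle\sum_{s=1}^{N} \sum_{t=1}^{N} e_{st}\, \mu^s \mu^t \right)^{1/2}} \tag{12.10}
$$

In formulas (12.8)-(12.10) notice that we have changed the particular indices that indicate summation so that no more than two indices occur in any summand. The reason for changing these dummy indices of summation can be made apparent by failing to change them. For example, in order to write (12.9) in its present form, we can write the component expressions for $\mathbf{v}$ in two equivalent ways,

$$
\mathbf{v} = \sum_{j=1}^{N} \mu^j \mathbf{e}_j, \qquad \mathbf{v} = \sum_{k=1}^{N} \mu^k \mathbf{e}_k
$$

and then take the dot product. If we had used the same indices to represent summation in both cases, then that index would occur four times in (12.9) and all the cross terms in the inner product would be left out.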
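The role of the cross terms becomes visible as soon as the basis is not orthogonal. The sketch below is an illustration of ours for $\mathscr{C}^2$ with the standard inner product; the basis, the components, and the helper names (`gram_matrix`, `from_components`, `inner_from_components`) are all assumptions made for the example. It evaluates the double sum $\sum_j \sum_k \lambda^j \bar{\mu}^k e_{jk}$ and compares it with the inner product computed directly from the assembled vectors.

```python
def inner(u, v):
    """Inner product on C^N: sum_j u_j * conj(v_j)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def gram_matrix(basis):
    """The numbers e_jk = e_j . e_k for every pair of base vectors."""
    return [[inner(ej, ek) for ek in basis] for ej in basis]


def from_components(coeffs, basis):
    """Assemble the vector sum_j coeffs[j] * basis[j]."""
    n = len(basis[0])
    return [sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(n)]


def inner_from_components(lam, mu, e):
    """Double sum over j and k of lam^j * conj(mu^k) * e_jk."""
    n = len(lam)
    return sum(lam[j] * complex(mu[k]).conjugate() * e[j][k]
               for j in range(n) for k in range(n))


basis = [[1, 0], [1, 1]]       # not orthogonal: e_1 . e_2 = 1
lam = [1 + 1j, 2]              # components of u
mu = [3, -1j]                  # components of v
e = gram_matrix(basis)

u = from_components(lam, basis)
v = from_components(mu, basis)
direct = inner(u, v)
via_components = inner_from_components(lam, mu, e)
```

The two values `direct` and `via_components` agree. Dropping the terms with $j \neq k$ from the double sum would change the result here, because $e_{12}$ and $e_{21}$ are not zero for this basis.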
Exercises

12.1 Derive the formula

$$
2\,\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u} + \mathbf{v}\|^2 + i\|\mathbf{u} + i\mathbf{v}\|^2 - (1 + i)\|\mathbf{u}\|^2 - (1 + i)\|\mathbf{v}\|^2
$$

which expresses the inner product of two vectors in terms of the norm. This formula is known as the polar identity.

12.2 Show that the norm $\|\ \|$ induced by an inner product according to the definition (12.1) must satisfy the following parallelogram law:

$$
\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\|\mathbf{v}\|^2 + 2\|\mathbf{u}\|^2
$$

for all vectors $\mathbf{u}$ and $\mathbf{v}$. Prove by counterexample that a norm in general need not be an induced norm of any inner product.

12.3 Use the definition of angle given in this section and the properties of the inner product to derive the law of cosines.

12.4 If $\mathscr{V}$ and $\mathscr{U}$ are inner product spaces, show that the equation

$$
f((\mathbf{v}, \mathbf{w}), (\mathbf{u}, \mathbf{b})) = \mathbf{v} \cdot \mathbf{u} + \mathbf{w} \cdot \mathbf{b}
$$

where $\mathbf{v}, \mathbf{u} \in \mathscr{V}$ and $\mathbf{w}, \mathbf{b} \in \mathscr{U}$, defines an inner product on $\mathscr{V} \oplus \mathscr{U}$.

12.5 Show that the only vector in $\mathscr{V}$ that is orthogonal to every other vector in $\mathscr{V}$ is the zero vector.

12.6 Show that the Schwarz and triangle inequalities become equalities if and only if the vectors concerned are linearly dependent.

12.7 Show that $\|\mathbf{u}_1 - \mathbf{u}_2\| \geq \big|\, \|\mathbf{u}_1\| - \|\mathbf{u}_2\| \,\big|$ for all $\mathbf{u}_1, \mathbf{u}_2$ in an inner product space $\mathscr{V}$.

12.8 Prove by direct calculation that $\sum_{j=1}^{N} \sum_{k=1}^{N} e_{jk}\, \mu^j \bar{\mu}^k$ in (12.9) is real.

12.9 If $\mathscr{U}$ is a subspace of an inner product space $\mathscr{V}$, prove that $\mathscr{U}$ is also an inner product space.

12.10 Prove that the $N \times N$ matrix $[e_{jk}]$ defined by (12.6) is nonsingular.
Section 13. Orthonormal Bases and Orthogonal Complements


Experience with analytic geometry tells us that it is generally much easier to use bases consisting of vectors which are orthogonal and of unit length rather than arbitrarily selected vectors. Vectors with a magnitude of 1 are called unit vectors or normalized vectors. A set of vectors in an inner product space $\mathscr{V}$ is said to be an orthogonal set if all the vectors in the set are mutually orthogonal, and it is said to be an orthonormal set if the set is orthogonal and if all the vectors are unit vectors. In equation form, an orthonormal set $\{\mathbf{i}_1, \dots, \mathbf{i}_M\}$ satisfies the conditions

$$
\mathbf{i}_j \cdot \mathbf{i}_k = \delta_{jk} \equiv
\begin{cases}
1 & \text{if } j = k \\
0 & \text{if } j \neq k
\end{cases}
\tag{13.1}
$$

The symbol $\delta_{jk}$ introduced in the equation above is called the Kronecker delta.


Theorem 13.1. An orthonormal set is linearly independent.


Proof. Assume that the orthonormal set $\{\mathbf{i}_1, \dots, \mathbf{i}_M\}$ is linearly dependent, that is to say there exists a set of scalars $\{\lambda^1, \lambda^2, \dots, \lambda^M\}$, not all zero, such that

$$
\sum_{j=1}^{M} \lambda^j \mathbf{i}_j = \mathbf{0}
$$

The inner product of this sum with the unit vector $\mathbf{i}_k$ gives the expression

$$
\sum_{j=1}^{M} \lambda^j \delta_{jk} = \lambda^1 \delta_{1k} + \cdots + \lambda^M \delta_{Mk} = \lambda^k = 0
$$

for $k = 1, 2, \dots, M$. A contradiction has therefore been achieved and the theorem is proved.


As a corollary to this theorem, it is easy to see that an orthonormal set in an inner product space $\mathscr{V}$ can have no more than $N = \dim \mathscr{V}$ elements. An orthonormal set is said to be complete in an inner product space $\mathscr{V}$ if it is not a proper subset of another orthonormal set in the same space. Therefore every orthonormal set with $N = \dim \mathscr{V}$ elements is complete and hence maximal in the sense of a linearly independent set. It is possible to prove the converse: Every complete orthonormal set in an inner product space $\mathscr{V}$ has $N = \dim \mathscr{V}$ elements. The same result is put in a slightly different fashion in the following theorem.

Theorem 13.2. A complete orthonormal set is a basis for $\mathscr{V}$; such a basis is called an orthonormal basis.


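Any set of linearly independent vectors can be used to construct an orthonormal set, and likewise, any basis can be used to construct an orthonormal basis. The process by which this is done is called the Gram-Schmidt orthogonalization process and it is developed in the proof of the following theorem.


Theorem 13.3. Given a basis $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$ of an inner product space $\mathscr{V}$, then there exists an orthonormal basis $\{\mathbf{i}_1, \dots, \mathbf{i}_N\}$ such that $\{\mathbf{e}_1, \dots, \mathbf{e}_k\}$ and $\{\mathbf{i}_1, \dots, \mathbf{i}_k\}$ generate the same subspace $\mathscr{U}_k$ of $\mathscr{V}$, for each $k = 1, \dots, N$.


Proof. The construction proceeds in two steps; first a set of orthogonal vectors is constructed, then this set is normalized. Let $\{\mathbf{d}_1, \mathbf{d}_2, \dots, \mathbf{d}_N\}$ denote a set of orthogonal, but not unit, vectors. This set is constructed from $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_N\}$ as follows: Let $\mathbf{d}_1 = \mathbf{e}_1$ and put

$$
\mathbf{d}_2 = \mathbf{e}_2 + \xi\,\mathbf{d}_1
$$

The scalar $\xi$ will be selected so that $\mathbf{d}_2$ is orthogonal to $\mathbf{d}_1$; orthogonality of $\mathbf{d}_2$ and $\mathbf{d}_1$ requires that their inner product be zero; hence

$$
\mathbf{d}_2 \cdot \mathbf{d}_1 = 0 = \mathbf{e}_2 \cdot \mathbf{d}_1 + \xi\,\mathbf{d}_1 \cdot \mathbf{d}_1
$$

implies

$$
\xi = -\frac{\mathbf{e}_2 \cdot \mathbf{d}_1}{\mathbf{d}_1 \cdot \mathbf{d}_1}
$$

where $\mathbf{d}_1 \cdot \mathbf{d}_1 \neq 0$ since $\mathbf{d}_1 \neq \mathbf{0}$. The vector $\mathbf{d}_2$ is not zero, because $\mathbf{e}_1$ and $\mathbf{e}_2$ are linearly independent. The vector $\mathbf{d}_3$ is defined by

$$
\mathbf{d}_3 = \mathbf{e}_3 + \xi^2 \mathbf{d}_2 + \xi^1 \mathbf{d}_1
$$

The scalars $\xi^2$ and $\xi^1$ are determined by the requirement that $\mathbf{d}_3$ be orthogonal to both $\mathbf{d}_1$ and $\mathbf{d}_2$; thus

$$
\begin{aligned}
\mathbf{d}_3 \cdot \mathbf{d}_1 &= \mathbf{e}_3 \cdot \mathbf{d}_1 + \xi^1\,\mathbf{d}_1 \cdot \mathbf{d}_1 = 0 \\
\mathbf{d}_3 \cdot \mathbf{d}_2 &= \mathbf{e}_3 \cdot \mathbf{d}_2 + \xi^2\,\mathbf{d}_2 \cdot \mathbf{d}_2 = 0
\end{aligned}
$$

and, as a result,

$$
\xi^1 = -\frac{\mathbf{e}_3 \cdot \mathbf{d}_1}{\mathbf{d}_1 \cdot \mathbf{d}_1}, \qquad \xi^2 = -\frac{\mathbf{e}_3 \cdot \mathbf{d}_2}{\mathbf{d}_2 \cdot \mathbf{d}_2}
$$

The linear independence of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ requires that $\mathbf{d}_3$ be nonzero. It is easy to see that this scheme can be repeated until a set of $N$ orthogonal vectors $\{\mathbf{d}_1, \mathbf{d}_2, \dots, \mathbf{d}_N\}$ has been obtained. The orthonormal set is then obtained by defining

$$
\mathbf{i}_k = \mathbf{d}_k / \|\mathbf{d}_k\|, \qquad k = 1, 2, \dots, N
$$

It is easy to see that $\{\mathbf{e}_1, \dots, \mathbf{e}_k\}$, $\{\mathbf{d}_1, \dots, \mathbf{d}_k\}$, and, hence, $\{\mathbf{i}_1, \dots, \mathbf{i}_k\}$ generate the same subspace for each $k$.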
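The two-step construction in the proof translates almost line by line into a short program. The following is a minimal sketch under our own assumptions: vectors are lists of (possibly complex) numbers with the standard inner product on $\mathscr{C}^N$, the input is assumed to be linearly independent, and the function names are ours. Each $\mathbf{d}_k$ is formed as $\mathbf{e}_k$ plus multiples of the earlier $\mathbf{d}_j$, with each coefficient equal to $-(\mathbf{e}_k \cdot \mathbf{d}_j)/(\mathbf{d}_j \cdot \mathbf{d}_j)$, and the results are then normalized.

```python
import math


def inner(u, v):
    """Inner product on C^N: sum_j u_j * conj(v_j), linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return an orthonormal list spanning the same nested subspaces as `basis`.

    Assumes the input vectors are linearly independent; a dependent input
    leads to a division by (nearly) zero.
    """
    ds = []  # orthogonal, not yet normalized
    for e in basis:
        d = [complex(x) for x in e]
        for dj in ds:
            # Coefficient chosen so that d . dj = 0.
            xi = -inner(e, dj) / inner(dj, dj)
            d = [a + xi * b for a, b in zip(d, dj)]
        ds.append(d)
    # Second step: scale every d_k to unit length.
    result = []
    for d in ds:
        norm = math.sqrt(inner(d, d).real)
        result.append([x / norm for x in d])
    return result


onb = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
onb_complex = gram_schmidt([[1, 1j], [1, 1]])
```

The sketch follows the proof in using the original $\mathbf{e}_k$ in every coefficient. In floating-point work a variant that uses the partially reduced vector instead is often preferred for accuracy, but that is a numerical refinement outside the scope of the theorem.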


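The concept of mutually orthogonal vectors can be generalized to mutually orthogonal subspaces. In particular, if $\mathcal{U}$ is a subspace of an inner product space $\mathcal{V}$, then the orthogonal complement of $\mathcal{U}$ is a subset of $\mathcal{V}$, denoted by $\mathcal{U}^{\perp}$, such that

$$\mathcal{U}^{\perp} = \{\mathbf{v} \mid \mathbf{v} \cdot \mathbf{u} = 0 \ \text{for all } \mathbf{u} \in \mathcal{U}\} \tag{13.2}$$

The properties of orthogonal complements are developed in the following theorem.

Theorem 13.4. If $\mathcal{U}$ is a subspace of $\mathcal{V}$, then (a) $\mathcal{U}^{\perp}$ is a subspace of $\mathcal{V}$ and (b) $\mathcal{V} = \mathcal{U} \oplus \mathcal{U}^{\perp}$.

Proof. (a) If $\mathbf{u}_1$ and $\mathbf{u}_2$ are in $\mathcal{U}^{\perp}$, then

$$\mathbf{u}_1 \cdot \mathbf{v} = \mathbf{u}_2 \cdot \mathbf{v} = 0$$

for all $\mathbf{v} \in \mathcal{U}$. Therefore for any $\lambda_1, \lambda_2 \in \mathcal{C}$,

$$(\lambda_1 \mathbf{u}_1 + \lambda_2 \mathbf{u}_2) \cdot \mathbf{v} = \lambda_1 \mathbf{u}_1 \cdot \mathbf{v} + \lambda_2 \mathbf{u}_2 \cdot \mathbf{v} = 0$$

Thus $\lambda_1 \mathbf{u}_1 + \lambda_2 \mathbf{u}_2 \in \mathcal{U}^{\perp}$.

(b) Consider the vector space $\mathcal{U} + \mathcal{U}^{\perp}$. Let $\mathbf{v} \in \mathcal{U} \cap \mathcal{U}^{\perp}$. Then by (13.2), $\mathbf{v} \cdot \mathbf{v} = 0$, which implies $\mathbf{v} = \mathbf{0}$. Thus $\mathcal{U} + \mathcal{U}^{\perp} = \mathcal{U} \oplus \mathcal{U}^{\perp}$. To establish that $\mathcal{V} = \mathcal{U} \oplus \mathcal{U}^{\perp}$, let $\{\mathbf{i}_1, \ldots, \mathbf{i}_R\}$ be an orthonormal basis for $\mathcal{U}$. Then consider the decomposition

$$\mathbf{v} = \left\{ \mathbf{v} - \sum_{q=1}^{R} (\mathbf{v} \cdot \mathbf{i}_q)\, \mathbf{i}_q \right\} + \sum_{q=1}^{R} (\mathbf{v} \cdot \mathbf{i}_q)\, \mathbf{i}_q$$

The term in the brackets is orthogonal to each $\mathbf{i}_q$ and it thus belongs to $\mathcal{U}^{\perp}$. Also, the second term is in $\mathcal{U}$. Therefore $\mathcal{V} = \mathcal{U} + \mathcal{U}^{\perp} = \mathcal{U} \oplus \mathcal{U}^{\perp}$, and the proof is complete.

As a result of the last theorem, any vector $\mathbf{v} \in \mathcal{V}$ can be written uniquely in the form

$$\mathbf{v} = \mathbf{u} + \mathbf{w} \tag{13.3}$$

where $\mathbf{u} \in \mathcal{U}$ and $\mathbf{w} \in \mathcal{U}^{\perp}$. If $\mathbf{v}$ is a vector with the decomposition indicated in (13.3), then

$$\|\mathbf{v}\|^2 = \|\mathbf{u} + \mathbf{w}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{w}\|^2 + \mathbf{u} \cdot \mathbf{w} + \mathbf{w} \cdot \mathbf{u} = \|\mathbf{u}\|^2 + \|\mathbf{w}\|^2$$

It follows from this equation that

$$\|\mathbf{v}\|^2 \geq \|\mathbf{u}\|^2, \qquad \|\mathbf{v}\|^2 \geq \|\mathbf{w}\|^2$$

The equation is the well-known Pythagorean theorem, and the inequalities are special cases of an inequality known as Bessel's inequality.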
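The decomposition (13.3) and the Pythagorean relation can be checked numerically. The sketch below assumes the standard dot product on real triples and a hypothetical subspace (the $x$-$y$ plane in $\mathcal{R}^3$); the helper names `dot` and `decompose` are illustrative only.

```python
def dot(u, v):
    """Standard inner product on real n-tuples (an assumption for this sketch)."""
    return sum(a * b for a, b in zip(u, v))

def decompose(v, orthonormal):
    """Split v into u in the span of `orthonormal` and w orthogonal to that span."""
    u = [0.0] * len(v)
    for i_q in orthonormal:
        coeff = dot(v, i_q)                          # the coefficient v . i_q
        u = [a + coeff * b for a, b in zip(u, i_q)]
    w = [a - b for a, b in zip(v, u)]                # the bracketed term of the proof
    return u, w

# Hypothetical data: U is the x-y plane in R^3, with an orthonormal basis
i_basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
v = (3.0, 4.0, 12.0)
u, w = decompose(v, i_basis)

norm_sq_v = dot(v, v)                 # ||v||^2
norm_sq_sum = dot(u, u) + dot(w, w)   # ||u||^2 + ||w||^2
```

Here `norm_sq_v` and `norm_sq_sum` agree, and each of `dot(u, u)` and `dot(w, w)` is bounded by `norm_sq_v`, as the two inequalities state.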


Exercises 


13.1 Show that with respect to an orthonormal basis the formulas (12.8)-(12.10) can be written as

$$\mathbf{u} \cdot \mathbf{v} = \sum_{j=1}^{N} u^j \bar{v}^j \tag{13.4}$$

$$\|\mathbf{v}\|^2 = \sum_{j=1}^{N} v^j \bar{v}^j \tag{13.5}$$

and, for a real vector space,

$$\cos\theta = \frac{\sum_{j=1}^{N} u^j v^j}{\left(\sum_{p=1}^{N} u^p u^p\right)^{1/2} \left(\sum_{l=1}^{N} v^l v^l\right)^{1/2}} \tag{13.6}$$

13.2 Prove Theorem 13.2.

13.3 If $\mathcal{U}$ is a subspace of the inner product space $\mathcal{V}$, show that

$$(\mathcal{U}^{\perp})^{\perp} = \mathcal{U}$$

13.4 If $\mathcal{V}$ is an inner product space, show that

$$\mathcal{V}^{\perp} = \{\mathbf{0}\}$$

Compare this with Exercise 12.5.

13.5 Given the basis

$$\mathbf{e}_1 = (1,1,1), \qquad \mathbf{e}_2 = (0,1,1), \qquad \mathbf{e}_3 = (0,0,1)$$

for $\mathcal{R}^3$, construct an orthonormal basis according to the Gram-Schmidt orthogonalization process.

13.6 If $\mathcal{U}$ is a subspace of an inner product space $\mathcal{V}$, show that

$$\dim \mathcal{U}^{\perp} = \dim \mathcal{V} - \dim \mathcal{U}$$

13.7 Given two subspaces $\mathcal{V}_1$ and $\mathcal{V}_2$ of $\mathcal{V}$, show that

$$(\mathcal{V}_1 + \mathcal{V}_2)^{\perp} = \mathcal{V}_1^{\perp} \cap \mathcal{V}_2^{\perp}$$

and

$$(\mathcal{V}_1 \cap \mathcal{V}_2)^{\perp} = \mathcal{V}_1^{\perp} + \mathcal{V}_2^{\perp}$$

13.8 In the proof of Theorem 13.4 we used the fact that if a vector is orthogonal to each vector $\mathbf{i}_q$ of a basis $\{\mathbf{i}_1, \ldots, \mathbf{i}_R\}$ of $\mathcal{U}$, then that vector belongs to $\mathcal{U}^{\perp}$. Give a proof of this fact.


Section 14. Reciprocal Basis and Change of Basis


In this section we define a special basis, called the reciprocal basis, associated with each basis of the inner product space $\mathcal{V}$. Components of vectors relative to both the basis and the reciprocal basis are defined and formulas for the change of basis are developed.

The set of $N$ vectors $\{\mathbf{e}^1, \mathbf{e}^2, \ldots, \mathbf{e}^N\}$ is said to be the reciprocal basis relative to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ of an inner product space $\mathcal{V}$ if

$$\mathbf{e}^k \cdot \mathbf{e}_s = \delta^k_s, \qquad k, s = 1, 2, \ldots, N \tag{14.1}$$

where the symbol $\delta^k_s$ is the Kronecker delta defined by

$$\delta^k_s = \begin{cases} 1, & k = s \\ 0, & k \neq s \end{cases} \tag{14.2}$$

Thus each vector $\mathbf{e}^k$ of the reciprocal basis is orthogonal to the $N-1$ basis vectors $\mathbf{e}_s$ with $s \neq k$, and its inner product with the remaining basis vector $\mathbf{e}_k$ has the value one. The following two theorems show that the reciprocal basis just defined exists uniquely and is actually a basis for $\mathcal{V}$.


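Theorem 14.1. The reciprocal basis relative to a given basis exists and is unique.

Proof. Existence. We prove only existence of the vector $\mathbf{e}^1$; existence of the remaining vectors can be proved similarly. Let $\mathcal{U}$ be the subspace generated by the vectors $\mathbf{e}_2, \ldots, \mathbf{e}_N$, and suppose that $\mathcal{U}^{\perp}$ is the orthogonal complement of $\mathcal{U}$. Then $\dim \mathcal{U} = N - 1$, and from Theorem 10.9, $\dim \mathcal{U}^{\perp} = 1$. Hence we can choose a nonzero vector $\mathbf{w} \in \mathcal{U}^{\perp}$. Since $\mathbf{e}_1 \notin \mathcal{U}$, $\mathbf{e}_1$ and $\mathbf{w}$ are not orthogonal,

$$\mathbf{e}_1 \cdot \mathbf{w} \neq 0$$

and since $\mathbf{w} \cdot \mathbf{e}_1 = \overline{\mathbf{e}_1 \cdot \mathbf{w}} \neq 0$ we can simply define

$$\mathbf{e}^1 = \frac{1}{\mathbf{w} \cdot \mathbf{e}_1}\, \mathbf{w}$$

Then $\mathbf{e}^1$ obeys (14.1) for $k = 1$.

Uniqueness. Assume that there are two reciprocal bases, $\{\mathbf{e}^1, \mathbf{e}^2, \ldots, \mathbf{e}^N\}$ and $\{\mathbf{d}^1, \mathbf{d}^2, \ldots, \mathbf{d}^N\}$, relative to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ of $\mathcal{V}$. Then from (14.1)

$$\mathbf{e}^k \cdot \mathbf{e}_s = \delta^k_s \quad \text{and} \quad \mathbf{d}^k \cdot \mathbf{e}_s = \delta^k_s \quad \text{for } k, s = 1, 2, \ldots, N$$

Subtracting these two equations, we obtain

$$(\mathbf{e}^k - \mathbf{d}^k) \cdot \mathbf{e}_s = 0, \qquad k, s = 1, 2, \ldots, N \tag{14.3}$$

Thus the vector $\mathbf{e}^k - \mathbf{d}^k$ must be orthogonal to $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$. Since the basis generates $\mathcal{V}$, (14.3) is equivalent to

$$(\mathbf{e}^k - \mathbf{d}^k) \cdot \mathbf{v} = 0 \quad \text{for all } \mathbf{v} \in \mathcal{V} \tag{14.4}$$

In particular, we can choose $\mathbf{v}$ in (14.4) to be equal to $\mathbf{e}^k - \mathbf{d}^k$, and it follows then from the definition of an inner product space that

$$\mathbf{e}^k = \mathbf{d}^k, \qquad k = 1, 2, \ldots, N \tag{14.5}$$

Therefore the reciprocal basis relative to $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ is unique.

The logic in passing from (14.4) to (14.5) has appeared before; cf. Exercise 13.8.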


Theorem 14.2. The reciprocal basis $\{\mathbf{e}^1, \mathbf{e}^2, \ldots, \mathbf{e}^N\}$ with respect to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ of the inner product space $\mathcal{V}$ is itself a basis for $\mathcal{V}$.

Proof. Consider the linear relation

$$\sum_{q=1}^{N} \lambda_q \mathbf{e}^q = \mathbf{0}$$

If we compute the inner product of this equation with $\mathbf{e}_k$, $k = 1, 2, \ldots, N$, then

$$\sum_{q=1}^{N} \lambda_q\, \mathbf{e}^q \cdot \mathbf{e}_k = \sum_{q=1}^{N} \lambda_q \delta^q_k = \lambda_k = 0$$

Thus the reciprocal basis is linearly independent. But since the reciprocal basis contains the same number of vectors as that of a basis, it is itself a basis for $\mathcal{V}$.

Since $\mathbf{e}^k$, $k = 1, 2, \ldots, N$, is in $\mathcal{V}$, we can always write

$$\mathbf{e}^k = \sum_{q=1}^{N} e^{kq} \mathbf{e}_q \tag{14.6}$$

where, by (14.1) and (12.6),

$$\mathbf{e}^k \cdot \mathbf{e}_s = \delta^k_s = \sum_{q=1}^{N} e^{kq} e_{qs} \tag{14.7}$$

and

$$\mathbf{e}^k \cdot \mathbf{e}^j = \sum_{q=1}^{N} e^{kq} \delta^j_q = e^{kj} = \overline{e^{jk}} \tag{14.8}$$

From a matrix viewpoint, (14.7) shows that the $N^2$ quantities $e^{kq}$ ($k, q = 1, 2, \ldots, N$) are the elements of the inverse of the matrix whose elements are $e_{qs}$. In particular, the matrices $[e^{kq}]$ and $[e_{qs}]$ are nonsingular. By Theorem 9.5, this remark is also a proof of Theorem 14.2. It is possible to establish from (14.6) and (14.7) that

$$\mathbf{e}_s = \sum_{k=1}^{N} e_{sk} \mathbf{e}^k \tag{14.9}$$

To illustrate the construction of a reciprocal basis by an algebraic method, consider the real vector space $\mathcal{R}^2$, which we shall represent by the Euclidean plane. Let a basis be given for $\mathcal{R}^2$ which consists of two vectors 45° apart, the first one, $\mathbf{e}_1$, two units long and the second one, $\mathbf{e}_2$, one unit long. These two vectors are illustrated in Figure 3.

To construct the reciprocal basis, we first note that from the given information and equation (12.6) we can write

$$e_{11} = 4, \qquad e_{12} = e_{21} = \sqrt{2}, \qquad e_{22} = 1 \tag{14.10}$$

Writing equations (14.7) out explicitly for the case $N = 2$, we have

$$e^{11} e_{11} + e^{12} e_{21} = 1, \qquad e^{21} e_{11} + e^{22} e_{21} = 0$$
$$e^{11} e_{12} + e^{12} e_{22} = 0, \qquad e^{21} e_{12} + e^{22} e_{22} = 1 \tag{14.11}$$

Substituting (14.10) into (14.11), we find that

$$e^{11} = \frac{1}{2}, \qquad e^{12} = e^{21} = -\frac{1}{\sqrt{2}}, \qquad e^{22} = 2 \tag{14.12}$$

When the results (14.12) are put into the special case of (14.6) for $N = 2$, we obtain the explicit expressions for the reciprocal basis $\{\mathbf{e}^1, \mathbf{e}^2\}$,

$$\mathbf{e}^1 = \frac{1}{2}\mathbf{e}_1 - \frac{1}{\sqrt{2}}\mathbf{e}_2, \qquad \mathbf{e}^2 = -\frac{1}{\sqrt{2}}\mathbf{e}_1 + 2\mathbf{e}_2 \tag{14.13}$$

The reciprocal basis is illustrated in Figure 3 also.

Figure 3. A basis and reciprocal basis for $\mathcal{R}^2$

Henceforth we shall use the same kernel letter to denote the components of a vector relative to a basis as we use for the vector itself. Thus, a vector $\mathbf{v}$ has components $v^k$, $k = 1, 2, \ldots, N$, relative to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ and components $v_k$, $k = 1, 2, \ldots, N$, relative to its reciprocal basis,

$$\mathbf{v} = \sum_{k=1}^{N} v^k \mathbf{e}_k, \qquad \mathbf{v} = \sum_{k=1}^{N} v_k \mathbf{e}^k \tag{14.14}$$

The components $v^1, v^2, \ldots, v^N$ with respect to the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ are often called the contravariant components of $\mathbf{v}$, while the components $v_1, v_2, \ldots, v_N$ with respect to the reciprocal basis $\{\mathbf{e}^1, \mathbf{e}^2, \ldots, \mathbf{e}^N\}$ are called covariant components. The names covariant and contravariant are somewhat arbitrary since the basis and reciprocal basis are both bases and we have no particular procedure to choose one over the other. The following theorem illustrates further the same remark.

Theorem 14.3. If $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$ is the reciprocal basis of $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$, then $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ is also the reciprocal basis of $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$.

For this reason we simply say that the bases $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ and $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$ are (mutually) reciprocal. The contravariant and covariant components of $\mathbf{v}$ are related to one another by the formulas

$$v^k = \mathbf{v} \cdot \mathbf{e}^k = \sum_{q=1}^{N} e^{qk} v_q, \qquad v_k = \mathbf{v} \cdot \mathbf{e}_k = \sum_{q=1}^{N} e_{qk} v^q \tag{14.15}$$

where equations (12.6) and (14.8) have been employed. More generally, if $\mathbf{u}$ has contravariant components $u^i$ and covariant components $u_i$ relative to $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ and $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$, respectively, then the inner product of $\mathbf{u}$ and $\mathbf{v}$ can be computed by the formulas

$$\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{N} u_i \bar{v}^i = \sum_{i=1}^{N} u^i \bar{v}_i = \sum_{i=1}^{N} \sum_{j=1}^{N} e^{ij} u_i \bar{v}_j = \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} u^i \bar{v}^j \tag{14.16}$$

which generalize the formulas (13.4) and (12.8).

As an example of the covariant and contravariant components of a vector, consider a vector $\mathbf{v}$ in $\mathcal{R}^2$ which has the representation

$$\mathbf{v} = \frac{3}{2}\mathbf{e}_1 + 2\mathbf{e}_2$$

relative to the basis $\{\mathbf{e}_1, \mathbf{e}_2\}$ illustrated in Figure 3. The contravariant components of $\mathbf{v}$ are then $(3/2, 2)$; to compute the covariant components, the second of the formulas (14.15) is written out for the case $N = 2$,

$$v_1 = e_{11} v^1 + e_{21} v^2, \qquad v_2 = e_{12} v^1 + e_{22} v^2$$

Then from (14.10) and the values of the contravariant components the covariant components are given by

$$v_1 = 6 + 2\sqrt{2}, \qquad v_2 = \frac{3}{\sqrt{2}} + 2$$

hence

$$\mathbf{v} = \left(6 + 2\sqrt{2}\right)\mathbf{e}^1 + \left(\frac{3}{\sqrt{2}} + 2\right)\mathbf{e}^2$$

The contravariant components of the vector $\mathbf{v}$ are illustrated in Figure 4. To develop a geometric feeling for covariant and contravariant components, it is helpful for one to perform a vector decomposition of this type for oneself.

Figure 4. The covariant and contravariant components of a vector in $\mathcal{R}^2$.
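The two-dimensional example can be reproduced numerically. The sketch below assumes a particular Cartesian placement of the basis of Figure 3 ($\mathbf{e}_1$ along the $x$-axis), which the text does not specify; the variable names are illustrative only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Basis of the example in Cartesian coordinates (the placement is an assumption):
# e_1 is two units long, e_2 is one unit long, 45 degrees apart.
e = [(2.0, 0.0), (math.cos(math.pi / 4), math.sin(math.pi / 4))]

# e_{jk} = e_j . e_k, as in (14.10)
g = [[dot(e[j], e[k]) for k in range(2)] for j in range(2)]

# e^{kq} are the elements of the inverse matrix, as in (14.12)
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det, g[0][0] / det]]

# Reciprocal basis e^k = sum_q e^{kq} e_q, as in (14.6)
e_up = [tuple(sum(g_inv[k][q] * e[q][c] for q in range(2)) for c in range(2))
        for k in range(2)]

# Covariant components v_k = sum_q e_{qk} v^q, as in (14.15)
v_contra = [1.5, 2.0]
v_co = [sum(g[q][k] * v_contra[q] for q in range(2)) for k in range(2)]
```

With this placement `e_up` comes out as approximately $(1/2, -1/2)$ and $(0, \sqrt{2})$, and `v_co` matches the covariant components $6 + 2\sqrt{2}$ and $3/\sqrt{2} + 2$ found above.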


A substantial part of the usefulness of orthonormal bases is that each orthonormal basis is self-reciprocal; hence contravariant and covariant components of vectors coincide. The self-reciprocity of orthonormal bases follows by comparing the condition (13.1) for an orthonormal basis with the condition (14.1) for a reciprocal basis. In orthonormal systems indices are written only as subscripts because there is no need to distinguish between covariant and contravariant components.


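Formulas for transferring from one basis to another basis in an inner product space can be developed for both base vectors and components. In these formulas one basis is the set of linearly independent vectors $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$, while the second basis is the set $\{\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_N\}$. From the fact that both bases generate $\mathcal{V}$, we can write

$$\hat{\mathbf{e}}_k = \sum_{s=1}^{N} T^s_k \mathbf{e}_s, \qquad k = 1, 2, \ldots, N \tag{14.17}$$

and

$$\mathbf{e}_q = \sum_{k=1}^{N} \hat{T}^k_q \hat{\mathbf{e}}_k, \qquad q = 1, 2, \ldots, N \tag{14.18}$$

where $T^s_k$ and $\hat{T}^k_q$ are both sets of $N^2$ scalars that are related to one another. From Theorem 9.5 the $N \times N$ matrices $[T^s_k]$ and $[\hat{T}^k_q]$ must be nonsingular.

Substituting (14.17) into (14.18) and replacing $\mathbf{e}_q$ by $\sum_{s=1}^{N} \delta^s_q \mathbf{e}_s$, we obtain

$$\sum_{k=1}^{N} \sum_{s=1}^{N} \hat{T}^k_q T^s_k \mathbf{e}_s = \sum_{s=1}^{N} \delta^s_q \mathbf{e}_s$$

which can be rewritten as

$$\sum_{s=1}^{N} \left( \sum_{k=1}^{N} \hat{T}^k_q T^s_k - \delta^s_q \right) \mathbf{e}_s = \mathbf{0} \tag{14.19}$$

The linear independence of the basis $\{\mathbf{e}_s\}$ requires that

$$\sum_{k=1}^{N} \hat{T}^k_q T^s_k = \delta^s_q \tag{14.20}$$

A similar argument yields

$$\sum_{k=1}^{N} T^k_q \hat{T}^s_k = \delta^s_q \tag{14.21}$$

In matrix language we see that (14.20) or (14.21) requires that the matrix of elements $T^k_q$ be the inverse of the matrix of elements $\hat{T}^k_q$. It is easy to verify that the reciprocal bases are related by

$$\mathbf{e}^k = \sum_{q=1}^{N} \overline{T^k_q}\, \hat{\mathbf{e}}^q, \qquad \hat{\mathbf{e}}^q = \sum_{k=1}^{N} \overline{\hat{T}^q_k}\, \mathbf{e}^k \tag{14.22}$$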
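The covariant and contravariant components of $\mathbf{v} \in \mathcal{V}$ relative to the two pairs of reciprocal bases are given by

$$\mathbf{v} = \sum_{k=1}^{N} v_k \mathbf{e}^k = \sum_{k=1}^{N} v^k \mathbf{e}_k = \sum_{q=1}^{N} \hat{v}^q \hat{\mathbf{e}}_q = \sum_{q=1}^{N} \hat{v}_q \hat{\mathbf{e}}^q \tag{14.23}$$

To obtain a relationship between, say, covariant components of $\mathbf{v}$, one can substitute (14.22)$_1$ into (14.23), thus

$$\sum_{k=1}^{N} \sum_{q=1}^{N} v_k \overline{T^k_q}\, \hat{\mathbf{e}}^q = \sum_{q=1}^{N} \hat{v}_q \hat{\mathbf{e}}^q$$

This equation can be rewritten in the form

$$\sum_{q=1}^{N} \left( \sum_{k=1}^{N} \overline{T^k_q}\, v_k - \hat{v}_q \right) \hat{\mathbf{e}}^q = \mathbf{0}$$

and it follows from the linear independence of the basis $\{\hat{\mathbf{e}}^q\}$ that

$$\hat{v}_q = \sum_{k=1}^{N} \overline{T^k_q}\, v_k \tag{14.24}$$

In a similar manner the following formulas can be obtained:

$$v_k = \sum_{q=1}^{N} \overline{\hat{T}^q_k}\, \hat{v}_q \tag{14.25}$$

$$\hat{v}^k = \sum_{q=1}^{N} \hat{T}^k_q v^q \tag{14.26}$$

and

$$v^k = \sum_{q=1}^{N} T^k_q \hat{v}^q \tag{14.27}$$

For a real vector space the complex conjugates in (14.22), (14.24), and (14.25) can be dropped.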
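The component transformation rules can be checked on a small real example. The sketch below uses exact rational arithmetic; the basis, the scalars $T^s_k$, and the vector are hypothetical, and since the space is real no complex conjugates appear. The array `T[k][s]` stores $T^s_k$ and `T_hat[q][k]` stores $\hat{T}^k_q$.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def combine(coeffs, vectors):
    """Linear combination sum_k coeffs[k] * vectors[k] in R^2."""
    return tuple(sum(c * vec[i] for c, vec in zip(coeffs, vectors)) for i in range(2))

# Hypothetical old basis (Cartesian coordinates) and transformation scalars
e = [(F(1), F(0)), (F(1), F(1))]
T = [[F(2), F(1)],
     [F(1), F(1)]]

# New basis: e_hat_k = sum_s T_k^s e_s, as in (14.17)
e_hat = [combine(T[k], e) for k in range(2)]

# T_hat is the inverse matrix, as required by (14.20)
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
T_hat = [[T[1][1] / det, -T[0][1] / det],
         [-T[1][0] / det, T[0][0] / det]]

# Contravariant components: v_hat^k = sum_q T_hat_q^k v^q, as in (14.26)
v_contra = [F(3), F(-1)]
v_hat_contra = [sum(T_hat[q][k] * v_contra[q] for q in range(2)) for k in range(2)]

# Covariant components: v_hat_q = sum_k T_q^k v_k, as in (14.24) for a real space
v = combine(v_contra, e)
v_co = [dot(v, e[k]) for k in range(2)]
v_hat_co = [sum(T[q][k] * v_co[k] for k in range(2)) for q in range(2)]
```

The contravariant components transform with the inverse scalars $\hat{T}^k_q$ while the covariant components transform with $T^k_q$ itself, which is the origin of the two names.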
Exercises 
14.1 Show that the quantities $T^q_k$ and $\hat{T}^q_k$ introduced in formulas (14.17) and (14.18) are given by the expressions

$$T^q_k = \hat{\mathbf{e}}_k \cdot \mathbf{e}^q, \qquad \hat{T}^q_k = \mathbf{e}_k \cdot \hat{\mathbf{e}}^q \tag{14.28}$$

14.2 Derive equation (14.21).

14.3 Derive equations (14.25)-(14.27).

14.4 Given the change of basis in a three-dimensional inner product space $\mathcal{V}$,
f,=2e,-e,-e,, f,=-e +e, f,=4e,-—e,+6e, 
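find the quantities $T^s_k$ and $\hat{T}^k_q$.

14.5 Given the vector $\mathbf{v} = \mathbf{e}_1 + \mathbf{e}_2 + \mathbf{e}_3$ in the vector space of the problem above, find the covariant and contravariant components of $\mathbf{v}$ relative to the basis $\{\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\}$.

14.6 Show that under the basis transformation (14.17)

$$\hat{e}_{jk} = \hat{\mathbf{e}}_j \cdot \hat{\mathbf{e}}_k = \sum_{s,q=1}^{N} T^s_j\, \overline{T^q_k}\, e_{sq}$$

and

$$\hat{e}^{jk} = \hat{\mathbf{e}}^j \cdot \hat{\mathbf{e}}^k = \sum_{s,q=1}^{N} \overline{\hat{T}^j_s}\, \hat{T}^k_q\, e^{sq}$$

14.7 Prove Theorem 14.3.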


Chapter 4 
LINEAR TRANSFORMATIONS 


Section 15 Definition of Linear Transformation 


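In this section and in the other sections of this chapter we shall introduce and study a special class of functions defined on a vector space. We shall assume that this space has an inner product, although this structure is not essential for the arguments in Sections 15-17. If $\mathcal{V}$ and $\mathcal{U}$ are vector spaces, a linear transformation is a function $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ such that

(a) $\mathbf{A}(\mathbf{u} + \mathbf{v}) = \mathbf{A}(\mathbf{u}) + \mathbf{A}(\mathbf{v})$
(b) $\mathbf{A}(\lambda \mathbf{u}) = \lambda \mathbf{A}(\mathbf{u})$

for all $\mathbf{u}, \mathbf{v} \in \mathcal{V}$ and $\lambda \in \mathcal{C}$. Condition (a) asserts that $\mathbf{A}$ is a homomorphism on $\mathcal{V}$ with respect to the operation of addition and thus the theorems of Section 6 can be applied here. Condition (b) shows that $\mathbf{A}$, in addition to being a homomorphism, is also homogeneous with respect to the operation of scalar multiplication. Observe that the $+$ symbol on the left side of (a) denotes addition in $\mathcal{V}$, while on the right side it denotes addition in $\mathcal{U}$. It would be extremely cumbersome to adopt different symbols for these quantities. Further, it is customary to omit the parentheses and write simply $\mathbf{A}\mathbf{u}$ for $\mathbf{A}(\mathbf{u})$ when $\mathbf{A}$ is a linear transformation.

Theorem 15.1. If $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is a function from a vector space $\mathcal{V}$ to a vector space $\mathcal{U}$, then $\mathbf{A}$ is a linear transformation if and only if

$$\mathbf{A}(\lambda \mathbf{u} + \mu \mathbf{v}) = \lambda \mathbf{A}(\mathbf{u}) + \mu \mathbf{A}(\mathbf{v})$$

for all $\mathbf{u}, \mathbf{v} \in \mathcal{V}$ and $\lambda, \mu \in \mathcal{C}$.

The proof of this theorem is an elementary application of the definition. It is also possible to show that

$$\mathbf{A}(\lambda^1 \mathbf{v}_1 + \lambda^2 \mathbf{v}_2 + \cdots + \lambda^R \mathbf{v}_R) = \lambda^1 \mathbf{A}\mathbf{v}_1 + \lambda^2 \mathbf{A}\mathbf{v}_2 + \cdots + \lambda^R \mathbf{A}\mathbf{v}_R \tag{15.1}$$

for all $\mathbf{v}_1, \ldots, \mathbf{v}_R \in \mathcal{V}$ and $\lambda^1, \ldots, \lambda^R \in \mathcal{C}$.

By application of Theorem 6.1, we see that for a linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$

$$\mathbf{A}\mathbf{0} = \mathbf{0} \quad \text{and} \quad \mathbf{A}(-\mathbf{v}) = -\mathbf{A}\mathbf{v} \tag{15.2}$$

Note that in (15.2)$_1$ we have used the same symbol for the zero vector in $\mathcal{V}$ as in $\mathcal{U}$.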


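Theorem 15.2. If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_R\}$ is a linearly dependent set in $\mathcal{V}$ and if $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is a linear transformation, then $\{\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_R\}$ is a linearly dependent set in $\mathcal{U}$.

Proof. Since the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_R$ are linearly dependent, we can write

$$\sum_{j=1}^{R} \lambda^j \mathbf{v}_j = \mathbf{0}$$

where at least one coefficient is not zero. Therefore

$$\mathbf{A}\left( \sum_{j=1}^{R} \lambda^j \mathbf{v}_j \right) = \sum_{j=1}^{R} \lambda^j \mathbf{A}\mathbf{v}_j = \mathbf{0}$$

where (15.1) and (15.2) have been used. The last equation proves the theorem.

If the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_R$ are linearly independent, then their image set $\{\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_R\}$ may or may not be linearly independent. For example, $\mathbf{A}$ might map all vectors into $\mathbf{0}$. The kernel of a linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is the set

$$K(\mathbf{A}) = \{\mathbf{v} \mid \mathbf{A}\mathbf{v} = \mathbf{0}\}$$

In other words, $K(\mathbf{A})$ is the preimage of the set $\{\mathbf{0}\}$ in $\mathcal{U}$. Since $\{\mathbf{0}\}$ is a subgroup of the additive group $\mathcal{U}$, it follows from Theorem 6.3 that $K(\mathbf{A})$ is a subgroup of the additive group $\mathcal{V}$. However, a stronger statement can be made.

Theorem 15.3. The set $K(\mathbf{A})$ is a subspace of $\mathcal{V}$.

Proof. Since $K(\mathbf{A})$ is a subgroup, we only need to prove that if $\mathbf{v} \in K(\mathbf{A})$, then $\lambda \mathbf{v} \in K(\mathbf{A})$ for all $\lambda \in \mathcal{C}$. This is clear since

$$\mathbf{A}(\lambda \mathbf{v}) = \lambda \mathbf{A}\mathbf{v} = \mathbf{0}$$

for all $\mathbf{v}$ in $K(\mathbf{A})$.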


The kernel of a linear transformation is sometimes called the null space. The nullity of a linear transformation is the dimension of the kernel, i.e., $\dim K(\mathbf{A})$. Since $K(\mathbf{A})$ is a subspace of $\mathcal{V}$, we have, by Theorem 10.2,

$$\dim K(\mathbf{A}) \leq \dim \mathcal{V} \tag{15.3}$$

Theorem 15.4. A linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is one-to-one if and only if $K(\mathbf{A}) = \{\mathbf{0}\}$.

Proof. This theorem is just a special case of Theorem 6.4. For ease of reference, we shall repeat the proof. If $\mathbf{A}\mathbf{u} = \mathbf{A}\mathbf{v}$, then by the linearity of $\mathbf{A}$, $\mathbf{A}(\mathbf{u} - \mathbf{v}) = \mathbf{0}$. Thus if $K(\mathbf{A}) = \{\mathbf{0}\}$, then $\mathbf{A}\mathbf{u} = \mathbf{A}\mathbf{v}$ implies $\mathbf{u} = \mathbf{v}$, so that $\mathbf{A}$ is one-to-one. Now assume $\mathbf{A}$ is one-to-one. Since $K(\mathbf{A})$ is a subspace, it must contain the zero in $\mathcal{V}$ and therefore $\mathbf{A}\mathbf{0} = \mathbf{0}$. If $K(\mathbf{A})$ contained any other element $\mathbf{v}$, we would have $\mathbf{A}\mathbf{v} = \mathbf{0}$, which contradicts the fact that $\mathbf{A}$ is one-to-one.

Linear transformations that are one-to-one are called regular linear transformations. The following theorem gives another condition for such linear transformations.


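Theorem 15.5. A linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is regular if and only if it maps linearly independent sets in $\mathcal{V}$ to linearly independent sets in $\mathcal{U}$.

Proof. Necessity. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_R\}$ be a linearly independent set in $\mathcal{V}$ and $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ be a regular linear transformation. Consider the sum

$$\sum_{j=1}^{R} \lambda^j \mathbf{A}\mathbf{v}_j = \mathbf{0}$$

This equation is equivalent to

$$\mathbf{A}\left( \sum_{j=1}^{R} \lambda^j \mathbf{v}_j \right) = \mathbf{0}$$

Since $\mathbf{A}$ is regular, we must have

$$\sum_{j=1}^{R} \lambda^j \mathbf{v}_j = \mathbf{0}$$

Since the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_R$ are linearly independent, this equation shows $\lambda^1 = \lambda^2 = \cdots = \lambda^R = 0$, which implies that the set $\{\mathbf{A}\mathbf{v}_1, \ldots, \mathbf{A}\mathbf{v}_R\}$ is linearly independent.

Sufficiency. The assumption that $\mathbf{A}$ preserves linear independence implies, in particular, that $\mathbf{A}\mathbf{v} \neq \mathbf{0}$ for every nonzero vector $\mathbf{v} \in \mathcal{V}$ since such a vector forms a linearly independent set. Therefore $K(\mathbf{A})$ consists of the zero vector only, and thus $\mathbf{A}$ is regular.

For a linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ we denote the range of $\mathbf{A}$ by

$$R(\mathbf{A}) = \{\mathbf{A}\mathbf{v} \mid \mathbf{v} \in \mathcal{V}\}$$

It follows from Theorem 6.2 that $R(\mathbf{A})$ is a subgroup of $\mathcal{U}$. We leave it to the reader to prove the stronger result stated below.

Theorem 15.6. The range $R(\mathbf{A})$ is a subspace of $\mathcal{U}$.

We have from Theorems 15.6 and 10.2 that

$$\dim R(\mathbf{A}) \leq \dim \mathcal{U} \tag{15.4}$$

The rank of a linear transformation is defined as the dimension of $R(\mathbf{A})$, i.e., $\dim R(\mathbf{A})$. A stronger statement than (15.4) can be made regarding the rank of a linear transformation $\mathbf{A}$.

Theorem 15.7. $\dim R(\mathbf{A}) \leq \min(\dim \mathcal{V}, \dim \mathcal{U})$.

Proof. Clearly, it suffices to prove that $\dim R(\mathbf{A}) \leq \dim \mathcal{V}$, since this inequality and (15.4) imply the assertion of the theorem. Let $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ be a basis for $\mathcal{V}$, where $N = \dim \mathcal{V}$. Then $\mathbf{v} \in \mathcal{V}$ can be written

$$\mathbf{v} = \sum_{j=1}^{N} v^j \mathbf{e}_j$$

Therefore any vector $\mathbf{A}\mathbf{v} \in R(\mathbf{A})$ can be written

$$\mathbf{A}\mathbf{v} = \sum_{j=1}^{N} v^j \mathbf{A}\mathbf{e}_j$$

Hence the vectors $\{\mathbf{A}\mathbf{e}_1, \mathbf{A}\mathbf{e}_2, \ldots, \mathbf{A}\mathbf{e}_N\}$ generate $R(\mathbf{A})$. By Theorem 9.10, we can conclude

$$\dim R(\mathbf{A}) \leq \dim \mathcal{V} \tag{15.5}$$

A result that improves on (15.5) is the following important theorem.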


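Theorem 15.8. If $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is a linear transformation, then

$$\dim \mathcal{V} = \dim R(\mathbf{A}) + \dim K(\mathbf{A}) \tag{15.6}$$

Proof. Let $P = \dim K(\mathbf{A})$, $R = \dim R(\mathbf{A})$, and $N = \dim \mathcal{V}$. We must prove that $N = P + R$. Select $N$ vectors in $\mathcal{V}$ such that $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_P\}$ is a basis for $K(\mathbf{A})$ and $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_P, \mathbf{e}_{P+1}, \ldots, \mathbf{e}_N\}$ a basis for $\mathcal{V}$. As in the proof of Theorem 15.7, the vectors $\{\mathbf{A}\mathbf{e}_1, \mathbf{A}\mathbf{e}_2, \ldots, \mathbf{A}\mathbf{e}_P, \mathbf{A}\mathbf{e}_{P+1}, \ldots, \mathbf{A}\mathbf{e}_N\}$ generate $R(\mathbf{A})$. But, by the properties of the kernel, $\mathbf{A}\mathbf{e}_1 = \mathbf{A}\mathbf{e}_2 = \cdots = \mathbf{A}\mathbf{e}_P = \mathbf{0}$. Thus the vectors $\{\mathbf{A}\mathbf{e}_{P+1}, \mathbf{A}\mathbf{e}_{P+2}, \ldots, \mathbf{A}\mathbf{e}_N\}$ generate $R(\mathbf{A})$. If we can establish that these vectors are linearly independent, we can conclude from Theorem 9.10 that the vectors $\{\mathbf{A}\mathbf{e}_{P+1}, \mathbf{A}\mathbf{e}_{P+2}, \ldots, \mathbf{A}\mathbf{e}_N\}$ form a basis for $R(\mathbf{A})$ and that $\dim R(\mathbf{A}) = R = N - P$. Consider the sum

$$\sum_{j=1}^{R} \lambda^j \mathbf{A}\mathbf{e}_{P+j} = \mathbf{0}$$

Therefore

$$\mathbf{A}\left( \sum_{j=1}^{R} \lambda^j \mathbf{e}_{P+j} \right) = \mathbf{0}$$

which implies that the vector $\sum_{j=1}^{R} \lambda^j \mathbf{e}_{P+j} \in K(\mathbf{A})$. This fact requires that $\lambda^1 = \lambda^2 = \cdots = \lambda^R = 0$, or otherwise the vector $\sum_{j=1}^{R} \lambda^j \mathbf{e}_{P+j}$ could be expanded in the basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_P\}$, contradicting the linear independence of $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$. Thus the set $\{\mathbf{A}\mathbf{e}_{P+1}, \mathbf{A}\mathbf{e}_{P+2}, \ldots, \mathbf{A}\mathbf{e}_N\}$ is linearly independent and the proof of the theorem is complete.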
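Formula (15.6) can be illustrated on a concrete matrix. The sketch below assumes that a linear transformation from a 4-dimensional space to a 3-dimensional space is represented by a hypothetical $3 \times 4$ matrix; it finds the rank by row reduction over the rationals and builds a kernel basis from the free columns. The function names are illustrative only.

```python
from fractions import Fraction as F

def rref(rows):
    """Reduced row echelon form over the rationals; returns (matrix, pivot_columns)."""
    m = [[F(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """One kernel vector per free (non-pivot) column of the reduced matrix."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [F(0)] * n
        v[free] = F(1)
        for row_index, p in enumerate(pivots):
            v[p] = -m[row_index][free]
        basis.append(v)
    return basis

def apply(rows, v):
    return [sum(F(a) * b for a, b in zip(row, v)) for row in rows]

# Hypothetical matrix; the third row is the sum of the first two, so the rank is 2
A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]

rank = len(rref(A)[1])       # dim R(A)
kernel = kernel_basis(A)     # basis of K(A)
nullity = len(kernel)        # dim K(A)
```

Here `rank + nullity` equals 4, the dimension of the domain, and every vector in `kernel` is mapped to zero by `A`.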


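As usual, a linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is said to be onto if $R(\mathbf{A}) = \mathcal{U}$, i.e., for every vector $\mathbf{u} \in \mathcal{U}$ there exists $\mathbf{v} \in \mathcal{V}$ such that $\mathbf{A}\mathbf{v} = \mathbf{u}$.

Theorem 15.9. $\dim R(\mathbf{A}) = \dim \mathcal{U}$ if and only if $\mathbf{A}$ is onto.

In the special case when $\dim \mathcal{V} = \dim \mathcal{U}$, it is possible to state the following important theorem.

Theorem 15.10. If $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is a linear transformation and if $\dim \mathcal{V} = \dim \mathcal{U}$, then $\mathbf{A}$ is a linear transformation onto $\mathcal{U}$ if and only if $\mathbf{A}$ is regular.

Proof. Assume that $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is onto $\mathcal{U}$; then (15.6) and Theorem 15.9 show that

$$\dim \mathcal{V} = \dim \mathcal{U} = \dim \mathcal{U} + \dim K(\mathbf{A})$$

Therefore $\dim K(\mathbf{A}) = 0$ and thus $K(\mathbf{A}) = \{\mathbf{0}\}$ and $\mathbf{A}$ is one-to-one. Next assume that $\mathbf{A}$ is one-to-one. By Theorem 15.4, $K(\mathbf{A}) = \{\mathbf{0}\}$ and thus $\dim K(\mathbf{A}) = 0$. Then (15.6) shows that

$$\dim \mathcal{V} = \dim R(\mathbf{A}) = \dim \mathcal{U}$$

By using Theorem 10.3, we can conclude that

$$R(\mathbf{A}) = \mathcal{U}$$

and thus $\mathbf{A}$ is onto.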


Exercises 


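15.1 Prove Theorems 15.1, 15.6, and 15.9.

15.2 Let $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ be a linear transformation, and let $\mathcal{V}'$ be a subspace of $\mathcal{V}$. The restriction of $\mathbf{A}$ to $\mathcal{V}'$ is a function $\mathbf{A}|_{\mathcal{V}'}: \mathcal{V}' \to \mathcal{U}$ defined by

$$\mathbf{A}|_{\mathcal{V}'}\, \mathbf{v} = \mathbf{A}\mathbf{v}$$

for all $\mathbf{v} \in \mathcal{V}'$. Show that $\mathbf{A}|_{\mathcal{V}'}$ is a linear transformation and that

$$K(\mathbf{A}|_{\mathcal{V}'}) = K(\mathbf{A}) \cap \mathcal{V}'$$

15.3 Let $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ be a linear transformation, and define a function $\bar{\mathbf{A}}: \mathcal{V}/K(\mathbf{A}) \to \mathcal{U}$ by

$$\bar{\mathbf{A}}\bar{\mathbf{v}} = \mathbf{A}\mathbf{v}$$

for all $\mathbf{v} \in \mathcal{V}$. Here $\bar{\mathbf{v}}$ denotes the equivalence set of $\mathbf{v}$ in $\mathcal{V}/K(\mathbf{A})$. Show that $\bar{\mathbf{A}}$ is a linear transformation. Show also that $\bar{\mathbf{A}}$ is regular and that $R(\bar{\mathbf{A}}) = R(\mathbf{A})$.

15.4 Let $\mathcal{V}'$ be a subspace of $\mathcal{V}$, and define a function $\mathbf{P}: \mathcal{V} \to \mathcal{V}/\mathcal{V}'$ by

$$\mathbf{P}\mathbf{v} = \bar{\mathbf{v}}$$

for all $\mathbf{v}$ in $\mathcal{V}$. Prove that $\mathbf{P}$ is a linear transformation, onto, and that $K(\mathbf{P}) = \mathcal{V}'$. The mapping $\mathbf{P}$ is called the canonical projection from $\mathcal{V}$ to $\mathcal{V}/\mathcal{V}'$.

15.5 Prove the formula

$$\dim \mathcal{V} = \dim(\mathcal{V}/\mathcal{V}') + \dim \mathcal{V}' \tag{15.7}$$

of Exercise 11.5 by applying Theorem 15.8 to the canonical projection $\mathbf{P}$ defined in the preceding exercise. Conversely, prove the formula (15.6) of Theorem 15.8 by using the formula (15.7) and the result of Exercise 15.3.

15.6 Let $\mathcal{U}$ and $\mathcal{V}$ be vector spaces, and let $\mathcal{U} \oplus \mathcal{V}$ be their direct sum. Define mappings $\mathbf{P}_1: \mathcal{U} \oplus \mathcal{V} \to \mathcal{U}$ and $\mathbf{P}_2: \mathcal{U} \oplus \mathcal{V} \to \mathcal{V}$ by

$$\mathbf{P}_1(\mathbf{u}, \mathbf{v}) = \mathbf{u}, \qquad \mathbf{P}_2(\mathbf{u}, \mathbf{v}) = \mathbf{v}$$

for all $\mathbf{u} \in \mathcal{U}$ and $\mathbf{v} \in \mathcal{V}$. Show that $\mathbf{P}_1$ and $\mathbf{P}_2$ are onto linear transformations. Show also the formula

$$\dim(\mathcal{U} \oplus \mathcal{V}) = \dim \mathcal{U} + \dim \mathcal{V}$$

of Theorem 10.9 by using these linear transformations and the formula (15.6). The mappings $\mathbf{P}_1$ and $\mathbf{P}_2$ are also called the canonical projections from $\mathcal{U} \oplus \mathcal{V}$ to $\mathcal{U}$ and $\mathcal{V}$, respectively.


Section 16. Sums and Products of Linear Transformations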


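In this section we shall assign meaning to the operations of addition and scalar multiplication for linear transformations. If $\mathbf{A}$ and $\mathbf{B}$ are linear transformations $\mathcal{V} \to \mathcal{U}$, then their sum $\mathbf{A} + \mathbf{B}$ is a linear transformation defined by

$$(\mathbf{A} + \mathbf{B})\mathbf{v} = \mathbf{A}\mathbf{v} + \mathbf{B}\mathbf{v} \tag{16.1}$$

for all $\mathbf{v} \in \mathcal{V}$. In a similar fashion, if $\lambda \in \mathcal{C}$, then $\lambda \mathbf{A}$ is a linear transformation $\mathcal{V} \to \mathcal{U}$ defined by

$$(\lambda \mathbf{A})\mathbf{v} = \lambda(\mathbf{A}\mathbf{v}) \tag{16.2}$$

for all $\mathbf{v} \in \mathcal{V}$. If we write $\mathcal{L}(\mathcal{V}; \mathcal{U})$ for the set of linear transformations from $\mathcal{V}$ to $\mathcal{U}$, then (16.1) and (16.2) make $\mathcal{L}(\mathcal{V}; \mathcal{U})$ a vector space. The zero element in $\mathcal{L}(\mathcal{V}; \mathcal{U})$ is the linear transformation $\mathbf{0}$ defined by

$$\mathbf{0}\mathbf{v} = \mathbf{0} \tag{16.3}$$

for all $\mathbf{v} \in \mathcal{V}$. The negative of $\mathbf{A} \in \mathcal{L}(\mathcal{V}; \mathcal{U})$ is a linear transformation $-\mathbf{A} \in \mathcal{L}(\mathcal{V}; \mathcal{U})$ defined by

$$-\mathbf{A} = -1\mathbf{A} \tag{16.4}$$

It follows from (16.4) that $-\mathbf{A}$ is the additive inverse of $\mathbf{A} \in \mathcal{L}(\mathcal{V}; \mathcal{U})$. This assertion follows from

$$\mathbf{A} + (-\mathbf{A}) = \mathbf{A} + (-1\mathbf{A}) = 1\mathbf{A} + (-1\mathbf{A}) = (1 - 1)\mathbf{A} = 0\mathbf{A} = \mathbf{0} \tag{16.5}$$

where (16.1) and (16.2) have been used. Consistent with our previous notation, we shall write $\mathbf{A} - \mathbf{B}$ for the sum $\mathbf{A} + (-\mathbf{B})$ formed from the linear transformations $\mathbf{A}$ and $\mathbf{B}$. The formal proof that $\mathcal{L}(\mathcal{V}; \mathcal{U})$ is a vector space is left as an exercise to the reader.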


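Theorem 16.1. $\dim \mathcal{L}(\mathcal{V}; \mathcal{U}) = \dim \mathcal{V} \dim \mathcal{U}$.

Proof. Let $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ be a basis for $\mathcal{V}$ and $\{\mathbf{b}_1, \ldots, \mathbf{b}_M\}$ be a basis for $\mathcal{U}$. Define $NM$ linear transformations $\mathbf{A}^k_\alpha: \mathcal{V} \to \mathcal{U}$ by

$$\mathbf{A}^k_\alpha \mathbf{e}_k = \mathbf{b}_\alpha, \qquad k = 1, \ldots, N; \ \alpha = 1, \ldots, M$$
$$\mathbf{A}^k_\alpha \mathbf{e}_p = \mathbf{0}, \qquad k \neq p \tag{16.6}$$

If $\mathbf{A}$ is an arbitrary member of $\mathcal{L}(\mathcal{V}; \mathcal{U})$, then $\mathbf{A}\mathbf{e}_k \in \mathcal{U}$, and thus

$$\mathbf{A}\mathbf{e}_k = \sum_{\alpha=1}^{M} A^\alpha_k \mathbf{b}_\alpha, \qquad k = 1, \ldots, N$$

where the scalars $A^\alpha_k$ are the components of $\mathbf{A}\mathbf{e}_k$ relative to $\{\mathbf{b}_\alpha\}$. Based upon the properties of $\mathbf{A}^k_\alpha$, we can write the above equation as

$$\mathbf{A}\mathbf{e}_k = \sum_{\alpha=1}^{M} A^\alpha_k \mathbf{A}^k_\alpha \mathbf{e}_k = \sum_{\alpha=1}^{M} \sum_{s=1}^{N} A^\alpha_s \mathbf{A}^s_\alpha \mathbf{e}_k$$

Therefore, since the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_N$ generate $\mathcal{V}$, we find

$$\left( \mathbf{A} - \sum_{\alpha=1}^{M} \sum_{s=1}^{N} A^\alpha_s \mathbf{A}^s_\alpha \right) \mathbf{v} = \mathbf{0}$$

for all vectors $\mathbf{v} \in \mathcal{V}$. Thus, from (16.3),

$$\mathbf{A} = \sum_{\alpha=1}^{M} \sum_{s=1}^{N} A^\alpha_s \mathbf{A}^s_\alpha \tag{16.7}$$

This equation means that the $MN$ linear transformations $\mathbf{A}^s_\alpha$ ($s = 1, \ldots, N$; $\alpha = 1, \ldots, M$) generate $\mathcal{L}(\mathcal{V}; \mathcal{U})$. If we can prove that these linear transformations are linearly independent, then the proof of the theorem is complete. To this end, set

$$\sum_{\alpha=1}^{M} \sum_{s=1}^{N} A^\alpha_s \mathbf{A}^s_\alpha = \mathbf{0}$$

Then, from (16.6),

$$\sum_{\alpha=1}^{M} \sum_{s=1}^{N} A^\alpha_s \mathbf{A}^s_\alpha \mathbf{e}_p = \sum_{\alpha=1}^{M} A^\alpha_p \mathbf{b}_\alpha = \mathbf{0}$$

Thus $A^\alpha_p = 0$ ($p = 1, \ldots, N$; $\alpha = 1, \ldots, M$) because the vectors $\mathbf{b}_1, \ldots, \mathbf{b}_M$ are linearly independent in $\mathcal{U}$. Hence $\{\mathbf{A}^s_\alpha\}$ is a basis of $\mathcal{L}(\mathcal{V}; \mathcal{U})$. As a result, we have

$$\dim \mathcal{L}(\mathcal{V}; \mathcal{U}) = MN = \dim \mathcal{U} \dim \mathcal{V} \tag{16.8}$$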
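In matrix terms, the basis transformations used in this proof correspond to matrices with a single entry equal to one. The sketch below assumes that a transformation is represented by an $M \times N$ array whose entry in row $\alpha$, column $s$ is the component $A^\alpha_s$; the names `unit` and `expand` and the sample components are hypothetical.

```python
def unit(alpha, s, M, N):
    """Matrix of the transformation sending e_s to b_alpha and every other e_p to 0."""
    return [[1 if (r == alpha and c == s) else 0 for c in range(N)] for r in range(M)]

def expand(A):
    """Rebuild A as the double sum over alpha and s of A[alpha][s] * unit(alpha, s)."""
    M, N = len(A), len(A[0])
    total = [[0] * N for _ in range(M)]
    for alpha in range(M):
        for s in range(N):
            E = unit(alpha, s, M, N)
            total = [[t + A[alpha][s] * x for t, x in zip(trow, erow)]
                     for trow, erow in zip(total, E)]
    return total

# Hypothetical components A^alpha_s with M = 2 and N = 3
A = [[1, -2, 0],
     [4, 5, 7]]

units = [unit(alpha, s, 2, 3) for alpha in range(2) for s in range(3)]
rebuilt = expand(A)
```

There are $MN = 6$ such unit matrices, and `rebuilt` reproduces `A`, which is the matrix counterpart of (16.7).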


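If $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ and $\mathbf{B}: \mathcal{U} \to \mathcal{W}$ are linear transformations, their product is a linear transformation $\mathcal{V} \to \mathcal{W}$, written $\mathbf{B}\mathbf{A}$, defined by

$$\mathbf{B}\mathbf{A}\mathbf{v} = \mathbf{B}(\mathbf{A}\mathbf{v}) \tag{16.9}$$

for all $\mathbf{v} \in \mathcal{V}$. The properties of the product operation are summarized in the following theorem.

Theorem 16.2.

$$\mathbf{C}(\mathbf{B}\mathbf{A}) = (\mathbf{C}\mathbf{B})\mathbf{A}$$
$$(\lambda \mathbf{A} + \mu \mathbf{B})\mathbf{C} = \lambda \mathbf{A}\mathbf{C} + \mu \mathbf{B}\mathbf{C} \tag{16.10}$$
$$\mathbf{C}(\lambda \mathbf{A} + \mu \mathbf{B}) = \lambda \mathbf{C}\mathbf{A} + \mu \mathbf{C}\mathbf{B}$$

for all $\lambda, \mu \in \mathcal{C}$, where it is understood that $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are defined on the proper vector spaces so as to make the indicated products defined.

The proof of Theorem 16.2 is left as an exercise to the reader.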


Exercises 
16.1 Prove that $\mathcal{L}(\mathcal{V}; \mathcal{U})$ is a vector space.

16.2 Prove Theorem 16.2.

16.3 Let $\mathcal{V}$, $\mathcal{U}$, and $\mathcal{W}$ be vector spaces. Given any linear mappings $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ and $\mathbf{B}: \mathcal{U} \to \mathcal{W}$, show that

$$\dim R(\mathbf{B}\mathbf{A}) \leq \min(\dim R(\mathbf{A}), \dim R(\mathbf{B}))$$

16.4 Let $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ be a linear transformation and define $\bar{\mathbf{A}}: \mathcal{V}/K(\mathbf{A}) \to \mathcal{U}$ as in Exercise 15.3. If $\mathbf{P}: \mathcal{V} \to \mathcal{V}/K(\mathbf{A})$ is the canonical projection defined in Exercise 15.4, show that

$$\mathbf{A} = \bar{\mathbf{A}}\mathbf{P}$$

This result along with the results of Exercises 15.3 and 15.4 shows that every linear transformation can be written as the composition of an onto linear transformation and a regular linear transformation.


Section 17. Special Types of Linear Transformations


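In this section we shall examine the properties of several special types of linear transformations. The first of these is one called an isomorphism. In Section 6 we discussed group isomorphisms. A vector space isomorphism is a regular onto linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$. It immediately follows from Theorem 15.8 that if $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ is an isomorphism, then

$$\dim \mathcal{V} = \dim \mathcal{U} \tag{17.1}$$

An isomorphism $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ establishes a one-to-one correspondence between the elements of $\mathcal{V}$ and $\mathcal{U}$. Thus there exists a unique inverse function $\mathbf{B}: \mathcal{U} \to \mathcal{V}$ with the property that if

$$\mathbf{u} = \mathbf{A}\mathbf{v} \tag{17.2}$$

then

$$\mathbf{v} = \mathbf{B}(\mathbf{u}) \tag{17.3}$$

for all $\mathbf{u} \in \mathcal{U}$ and $\mathbf{v} \in \mathcal{V}$. We shall now show that $\mathbf{B}$ is a linear transformation. Consider the vectors $\mathbf{u}_1$ and $\mathbf{u}_2 \in \mathcal{U}$ and the corresponding vectors [as a result of (17.2) and (17.3)] $\mathbf{v}_1$ and $\mathbf{v}_2 \in \mathcal{V}$. Then by (17.2), (17.3), and the properties of the linear transformation $\mathbf{A}$,

$$\begin{aligned}
\mathbf{B}(\lambda \mathbf{u}_1 + \mu \mathbf{u}_2) &= \mathbf{B}(\lambda \mathbf{A}\mathbf{v}_1 + \mu \mathbf{A}\mathbf{v}_2) \\
&= \mathbf{B}(\mathbf{A}(\lambda \mathbf{v}_1 + \mu \mathbf{v}_2)) \\
&= \lambda \mathbf{v}_1 + \mu \mathbf{v}_2 \\
&= \lambda \mathbf{B}(\mathbf{u}_1) + \mu \mathbf{B}(\mathbf{u}_2)
\end{aligned}$$

Thus $\mathbf{B}$ is a linear transformation. This linear transformation shall be written $\mathbf{A}^{-1}$. Clearly the linear transformation $\mathbf{A}^{-1}$ is also an isomorphism whose inverse is $\mathbf{A}$; i.e.,

$$(\mathbf{A}^{-1})^{-1} = \mathbf{A} \tag{17.4}$$

Theorem 17.1. If $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ and $\mathbf{B}: \mathcal{U} \to \mathcal{W}$ are isomorphisms, then $\mathbf{B}\mathbf{A}: \mathcal{V} \to \mathcal{W}$ is an isomorphism whose inverse is computed by

$$(\mathbf{B}\mathbf{A})^{-1} = \mathbf{A}^{-1}\mathbf{B}^{-1} \tag{17.5}$$

Proof. The fact that $\mathbf{B}\mathbf{A}$ is an isomorphism follows directly from the corresponding properties of $\mathbf{A}$ and $\mathbf{B}$. The fact that the inverse of $\mathbf{B}\mathbf{A}$ is computed by (17.5) follows directly because if

$$\mathbf{u} = \mathbf{A}\mathbf{v} \quad \text{and} \quad \mathbf{w} = \mathbf{B}\mathbf{u}$$

then

$$\mathbf{v} = \mathbf{A}^{-1}\mathbf{u} \quad \text{and} \quad \mathbf{u} = \mathbf{B}^{-1}\mathbf{w}$$

Thus

$$\mathbf{v} = (\mathbf{B}\mathbf{A})^{-1}\mathbf{w} = \mathbf{A}^{-1}\mathbf{B}^{-1}\mathbf{w}$$

Therefore $((\mathbf{B}\mathbf{A})^{-1} - \mathbf{A}^{-1}\mathbf{B}^{-1})\mathbf{w} = \mathbf{0}$ for all $\mathbf{w} \in \mathcal{W}$, which implies (17.5).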


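The identity linear transformation $\mathbf{I}: \mathcal{V} \to \mathcal{V}$ is defined by

$$\mathbf{I}\mathbf{v} = \mathbf{v} \tag{17.6}$$

for all $\mathbf{v}$ in $\mathcal{V}$. Often it is desirable to distinguish the identity linear transformations on different vector spaces. In these cases we shall denote the identity linear transformation on $\mathcal{V}$ by $\mathbf{I}_{\mathcal{V}}$. It follows from (17.2) and (17.3) that if $\mathbf{A}$ is an isomorphism, then

$$\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}_{\mathcal{U}} \quad \text{and} \quad \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}_{\mathcal{V}} \tag{17.7}$$

Conversely, if $\mathbf{A}$ is a linear transformation from $\mathcal{V}$ to $\mathcal{U}$, and if there exists a linear transformation $\mathbf{B}: \mathcal{U} \to \mathcal{V}$ such that $\mathbf{A}\mathbf{B} = \mathbf{I}_{\mathcal{U}}$ and $\mathbf{B}\mathbf{A} = \mathbf{I}_{\mathcal{V}}$, then $\mathbf{A}$ is an isomorphism and $\mathbf{B} = \mathbf{A}^{-1}$. The proof of this assertion is left as an exercise to the reader. Isomorphisms are often referred to as invertible or nonsingular linear transformations.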


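A vector space $\mathcal{V}$ and a vector space $\mathcal{U}$ are said to be isomorphic if there exists at least one isomorphism from $\mathcal{V}$ to $\mathcal{U}$.

Theorem 17.2. Two finite-dimensional vector spaces $\mathcal{V}$ and $\mathcal{U}$ are isomorphic if and only if they have the same dimension.

Proof. Clearly, if $\mathcal{V}$ and $\mathcal{U}$ are isomorphic, by virtue of the properties of isomorphisms, $\dim \mathcal{V} = \dim \mathcal{U}$. If $\mathcal{V}$ and $\mathcal{U}$ have the same dimension, we can construct a regular onto linear transformation $\mathbf{A}: \mathcal{V} \to \mathcal{U}$ as follows. If $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ is a basis for $\mathcal{V}$ and $\{\mathbf{b}_1, \ldots, \mathbf{b}_N\}$ is a basis for $\mathcal{U}$, define $\mathbf{A}$ by

$$\mathbf{A}\mathbf{e}_k = \mathbf{b}_k, \qquad k = 1, \ldots, N \tag{17.8}$$

Or, equivalently, if

$$\mathbf{v} = \sum_{k=1}^{N} v^k \mathbf{e}_k$$

then define $\mathbf{A}$ by

$$\mathbf{A}\mathbf{v} = \sum_{k=1}^{N} v^k \mathbf{b}_k \tag{17.9}$$

$\mathbf{A}$ is regular because if $\mathbf{A}\mathbf{v} = \mathbf{0}$, then the linear independence of $\{\mathbf{b}_k\}$ forces every $v^k$ to vanish, so $\mathbf{v} = \mathbf{0}$. Theorem 15.10 tells us $\mathbf{A}$ is onto and thus is an isomorphism.

As a corollary to Theorem 17.2, we see that $\mathcal{V}$ and the vector space $\mathcal{C}^N$, where $N = \dim \mathcal{V}$, are isomorphic.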


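In Section 16 we introduced the notation $\mathcal{L}(\mathcal{V}; \mathcal{U})$ for the vector space of linear transformations from $\mathcal{V}$ to $\mathcal{U}$. The set $\mathcal{L}(\mathcal{V}; \mathcal{V})$ corresponds to the vector space of linear transformations $\mathcal{V} \to \mathcal{V}$. An element of $\mathcal{L}(\mathcal{V}; \mathcal{V})$ is called an endomorphism of $\mathcal{V}$. This nomenclature parallels the previous usage of the word endomorphism introduced in Section 6. If an endomorphism is regular (and thus onto), it is called an automorphism. The identity linear transformation defined by (17.6) is an example of an automorphism. If $\mathbf{A} \in \mathcal{L}(\mathcal{V}; \mathcal{V})$, then it is easily seen that

$$\mathbf{A}\mathbf{I} = \mathbf{I}\mathbf{A} = \mathbf{A} \tag{17.10}$$

Also, if $\mathbf{A}$ and $\mathbf{B}$ are in $\mathcal{L}(\mathcal{V}; \mathcal{V})$, it is meaningful to compute the products $\mathbf{A}\mathbf{B}$ and $\mathbf{B}\mathbf{A}$; however,

$$\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A} \tag{17.11}$$

in general. For example, let $\mathcal{V}$ be a two-dimensional vector space with basis $\{\mathbf{e}_1, \mathbf{e}_2\}$ and define $\mathbf{A}$ and $\mathbf{B}$ by the rules

$$\mathbf{A}\mathbf{e}_k = \sum_{j=1}^{2} A^j_k \mathbf{e}_j \quad \text{and} \quad \mathbf{B}\mathbf{e}_k = \sum_{j=1}^{2} B^j_k \mathbf{e}_j$$

where $A^j_k$ and $B^j_k$, $k, j = 1, 2$, are prescribed. Then

$$\mathbf{B}\mathbf{A}\mathbf{e}_k = \sum_{j,l=1}^{2} A^j_k B^l_j \mathbf{e}_l \quad \text{and} \quad \mathbf{A}\mathbf{B}\mathbf{e}_k = \sum_{j,l=1}^{2} B^j_k A^l_j \mathbf{e}_l$$

An examination of these formulas shows that it is only for special values of $A^j_k$ and $B^j_k$ that $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}$.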
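A concrete pair of endomorphisms makes the failure of commutativity visible. In the sketch below the components are hypothetical and are stored so that entry `[j][k]` holds $A^j_k$; with that layout the components of a product are given by the ordinary matrix product.

```python
def matmul(X, Y):
    """Components of the product XY (apply Y first, then X)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical components on a two-dimensional space; column k lists the components of A e_k
A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]
I = [[1, 0],
     [0, 1]]

AB = matmul(A, B)
BA = matmul(B, A)
```

For these values `AB` and `BA` differ, while `matmul(A, I)` and `matmul(I, A)` both return `A`, in agreement with (17.10) and (17.11).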


The set $\mathcal{L}(\mathcal{V}; \mathcal{V})$ has defined on it three operations. They are (a) addition of elements of $\mathcal{L}(\mathcal{V}; \mathcal{V})$, (b) multiplication of an element of $\mathcal{L}(\mathcal{V}; \mathcal{V})$ by a scalar, and (c) the product of a pair of elements of $\mathcal{L}(\mathcal{V}; \mathcal{V})$. The operations (a) and (b) make $\mathcal{L}(\mathcal{V}; \mathcal{V})$ into a vector space, while it is easily shown that the operations (a) and (c) make $\mathcal{L}(\mathcal{V}; \mathcal{V})$ into a ring. The structure of $\mathcal{L}(\mathcal{V}; \mathcal{V})$ is an example of an associative algebra.

The subset of $\mathcal{L}(\mathcal{V}; \mathcal{V})$ that consists of all automorphisms of $\mathcal{V}$ is denoted by $GL(\mathcal{V})$. It is immediately apparent that $GL(\mathcal{V})$ is not a subspace of $\mathcal{L}(\mathcal{V}; \mathcal{V})$, because the sum of two of its elements need not be an automorphism. However, this set is easily shown to be a group with respect to the product operation. This group is called the general linear group. Its identity element is $\mathbf{I}$, and if $\mathbf{A} \in GL(\mathcal{V})$, its inverse is $\mathbf{A}^{-1} \in GL(\mathcal{V})$.


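A projection is an endomorphism $\mathbf{P} \in \mathcal{L}(\mathcal{V}; \mathcal{V})$ which satisfies the condition

$$\mathbf{P}^2 = \mathbf{P} \tag{17.12}$$

The following theorem gives an important property of a projection.

Theorem 17.3. If $\mathbf{P}: \mathcal{V} \to \mathcal{V}$ is a projection, then

$$\mathcal{V} = R(\mathbf{P}) \oplus K(\mathbf{P}) \tag{17.13}$$

Proof. Let $\mathbf{v}$ be an arbitrary vector in $\mathcal{V}$. Let

$$\mathbf{w} = \mathbf{v} - \mathbf{P}\mathbf{v} \tag{17.14}$$

Then, by (17.12), $\mathbf{P}\mathbf{w} = \mathbf{P}\mathbf{v} - \mathbf{P}(\mathbf{P}\mathbf{v}) = \mathbf{P}\mathbf{v} - \mathbf{P}\mathbf{v} = \mathbf{0}$. Thus, $\mathbf{w} \in K(\mathbf{P})$. Since $\mathbf{P}\mathbf{v} \in R(\mathbf{P})$, (17.14) implies that

$$\mathcal{V} = R(\mathbf{P}) + K(\mathbf{P})$$

To show that $R(\mathbf{P}) \cap K(\mathbf{P}) = \{\mathbf{0}\}$, let $\mathbf{u} \in R(\mathbf{P}) \cap K(\mathbf{P})$. Then, since $\mathbf{u} \in R(\mathbf{P})$, $\mathbf{u} = \mathbf{P}\mathbf{v}$ for some $\mathbf{v} \in \mathcal{V}$. But, since $\mathbf{u}$ is also in $K(\mathbf{P})$,

$$\mathbf{0} = \mathbf{P}\mathbf{u} = \mathbf{P}(\mathbf{P}\mathbf{v}) = \mathbf{P}\mathbf{v} = \mathbf{u}$$

which completes the proof.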
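The decomposition in this proof can be traced on a small example. The sketch below uses a hypothetical projection of $\mathcal{R}^2$ onto the line spanned by $(1, 0)$ along the line spanned by $(1, 1)$, written as a matrix; it is deliberately not an orthogonal projection, since the definition (17.12) does not require one.

```python
from fractions import Fraction as F

def apply(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

# Hypothetical projection: P(1, 0) = (1, 0) and P(1, 1) = (0, 0)
P = [[F(1), F(-1)],
     [F(0), F(0)]]

v = [F(5), F(2)]
u = apply(P, v)                       # u = Pv lies in R(P)
w = [a - b for a, b in zip(v, u)]     # w = v - Pv lies in K(P), as in (17.14)
```

Here `matmul(P, P)` equals `P`, `apply(P, u)` returns `u`, `apply(P, w)` is the zero vector, and `u` and `w` add up to `v`.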




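The name projection arises from the geometric interpretation of (17.13). Given any $\mathbf{v} \in \mathcal{V}$, there are unique vectors $\mathbf{u} \in R(\mathbf{P})$ and $\mathbf{w} \in K(\mathbf{P})$ such that

$$\mathbf{v} = \mathbf{u} + \mathbf{w} \tag{17.15}$$

where

$$\mathbf{P}\mathbf{u} = \mathbf{u} \quad \text{and} \quad \mathbf{P}\mathbf{w} = \mathbf{0} \tag{17.16}$$

Geometrically, $\mathbf{P}$ takes $\mathbf{v}$ and projects it onto the subspace $R(\mathbf{P})$ along the subspace $K(\mathbf{P})$. Figure 5 illustrates this point for $\mathcal{V} = \mathcal{R}^2$.

Figure 5 (labels: $\mathbf{u} = \mathbf{P}\mathbf{v}$, $\mathbf{w} = (\mathbf{I} - \mathbf{P})\mathbf{v}$)

Given a projection $\mathbf{P}$, the linear transformation $\mathbf{I} - \mathbf{P}$ is also a projection. It is easily shown that

$$\mathcal{V} = R(\mathbf{I} - \mathbf{P}) \oplus K(\mathbf{I} - \mathbf{P})$$

It follows from (17.16) that the restriction of $\mathbf{P}$ to $R(\mathbf{P})$ is the identity linear transformation on the subspace $R(\mathbf{P})$. Likewise, the restriction of $\mathbf{I} - \mathbf{P}$ to $K(\mathbf{P})$ is the identity linear transformation on $K(\mathbf{P})$. Theorem 17.3 is a special case of the following theorem.

Theorem 17.4. If $\mathbf{P}_k$, $k = 1, \ldots, R$, are projection operators with the properties that

$$\mathbf{P}_k^2 = \mathbf{P}_k, \qquad k = 1, \ldots, R$$
$$\mathbf{P}_k \mathbf{P}_q = \mathbf{0}, \qquad k \neq q \tag{17.17}$$

and

$$\mathbf{I} = \sum_{k=1}^{R} \mathbf{P}_k \tag{17.18}$$

then

$$\mathcal{V} = R(\mathbf{P}_1) \oplus R(\mathbf{P}_2) \oplus \cdots \oplus R(\mathbf{P}_R) \tag{17.19}$$

The proof of this theorem is left as an exercise for the reader. As a converse of Theorem 17.4, if $\mathcal{V}$ has the decomposition

$$\mathcal{V} = \mathcal{V}_1 \oplus \cdots \oplus \mathcal{V}_R \tag{17.20}$$

then the endomorphisms $\mathbf{P}_k: \mathcal{V} \to \mathcal{V}$ defined by

$$\mathbf{P}_k \mathbf{v} = \mathbf{v}_k, \qquad k = 1, \ldots, R \tag{17.21}$$

where

$$\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 + \cdots + \mathbf{v}_R \tag{17.22}$$

with $\mathbf{v}_k \in \mathcal{V}_k$, are projections and satisfy (17.18). Moreover, $\mathcal{V}_k = R(\mathbf{P}_k)$, $k = 1, \ldots, R$.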


Exercises 


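17.1 Let $\mathbf{A}: \mathscr{V} \to \mathscr{U}$ and $\mathbf{B}: \mathscr{U} \to \mathscr{V}$ be linear transformations. If $\mathbf{A}\mathbf{B} = \mathbf{I}$, then $\mathbf{B}$ is the right inverse of $\mathbf{A}$. If $\mathbf{B}\mathbf{A} = \mathbf{I}$, then $\mathbf{B}$ is the left inverse of $\mathbf{A}$. Show that $\mathbf{A}$ is an isomorphism if and only if it has a right inverse and a left inverse.

17.2 Show that if an endomorphism $\mathbf{A}$ of $\mathscr{V}$ commutes with every endomorphism $\mathbf{B}$ of $\mathscr{V}$, then $\mathbf{A}$ is a scalar multiple of $\mathbf{I}$.

17.3 Prove Theorem 17.4.

17.4 An involution is an endomorphism $\mathbf{L}$ such that $\mathbf{L}^2 = \mathbf{I}$. Show that $\mathbf{L}$ is an involution if and only if $\mathbf{P} = \frac{1}{2}(\mathbf{L} + \mathbf{I})$ is a projection.

17.5 Consider the linear transformation $\mathbf{A}|_{\mathscr{V}'}: \mathscr{V}' \to \mathscr{U}$ defined in Exercise 15.2. Show that if $\mathscr{V} = \mathscr{V}' \oplus K(\mathbf{A})$, then $\mathbf{A}|_{\mathscr{V}'}$ is an isomorphism from $\mathscr{V}'$ to $R(\mathbf{A})$.

17.6 If $\mathbf{A}$ is a linear transformation from $\mathscr{V}$ to $\mathscr{U}$ where $\dim\mathscr{V} = \dim\mathscr{U}$, assume that there exists a linear transformation $\mathbf{B}: \mathscr{U} \to \mathscr{V}$ such that $\mathbf{A}\mathbf{B} = \mathbf{I}$ (or $\mathbf{B}\mathbf{A} = \mathbf{I}$). Show that $\mathbf{A}$ is an isomorphism and that $\mathbf{B} = \mathbf{A}^{-1}$.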
Section 18. The Adjoint of a Linear Transformation


The results in the earlier sections of this chapter did not make use of the inner product structure on $\mathscr{V}$. In this section, however, a particular inner product is needed to study the adjoint of a linear transformation as well as other ideas associated with the adjoint.


Given a linear transformation $\mathbf{A}: \mathscr{V} \to \mathscr{U}$, a function $\mathbf{A}^*: \mathscr{U} \to \mathscr{V}$ is called the adjoint of $\mathbf{A}$ if

$$\mathbf{u} \cdot (\mathbf{A}\mathbf{v}) = (\mathbf{A}^*\mathbf{u}) \cdot \mathbf{v} \tag{18.1}$$

for all $\mathbf{v} \in \mathscr{V}$ and $\mathbf{u} \in \mathscr{U}$. Observe that in (18.1) the inner product on the left side is the one in $\mathscr{U}$, while the one on the right side is the one in $\mathscr{V}$. Next we examine the properties of the adjoint. It is worth noting that for linear transformations defined on real inner product spaces what we have called the adjoint is often called the transpose. Since our later applications are for real vector spaces, the name transpose is actually more important.


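Theorem 18.1. For every linear transformation $\mathbf{A}: \mathscr{V} \to \mathscr{U}$, there exists a unique adjoint $\mathbf{A}^*: \mathscr{U} \to \mathscr{V}$ satisfying the condition (18.1).


Proof. Existence. Choose a basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ for $\mathscr{V}$ and a basis $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$ for $\mathscr{U}$. Then $\mathbf{A}$ can be characterized by the $M \times N$ matrix $[A^\alpha{}_k]$ in such a way that

$$\mathbf{A}\mathbf{e}_k = \sum_{\alpha=1}^{M} A^\alpha{}_k\,\mathbf{b}_\alpha, \qquad k = 1,\ldots,N \tag{18.2}$$

This system suffices to define $\mathbf{A}$ since for any $\mathbf{v} \in \mathscr{V}$ with the representation

$$\mathbf{v} = \sum_{k=1}^{N} v^k\,\mathbf{e}_k$$

the corresponding representation of $\mathbf{A}\mathbf{v}$ is determined by $[A^\alpha{}_k]$ and $[v^k]$ by

$$\mathbf{A}\mathbf{v} = \sum_{\alpha=1}^{M}\left(\sum_{k=1}^{N} A^\alpha{}_k\,v^k\right)\mathbf{b}_\alpha$$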



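Now let $\{\mathbf{e}^1,\ldots,\mathbf{e}^N\}$ and $\{\mathbf{b}^1,\ldots,\mathbf{b}^M\}$ be the reciprocal bases of $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ and $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$, respectively. We shall define a linear transformation $\mathbf{A}^*$ by a system similar to (18.2) except that we shall use the reciprocal bases. Thus we put

$$\mathbf{A}^*\mathbf{b}^\alpha = \sum_{k=1}^{N} A^*{}_k{}^\alpha\,\mathbf{e}^k \tag{18.3}$$

where the matrix $[A^*{}_k{}^\alpha]$ is defined by

$$A^*{}_k{}^\alpha = \overline{A^\alpha{}_k} \tag{18.4}$$

for all $\alpha = 1,\ldots,M$ and $k = 1,\ldots,N$. For any vector $\mathbf{u} \in \mathscr{U}$ the representation of $\mathbf{A}^*\mathbf{u}$ relative to $\{\mathbf{e}^1,\ldots,\mathbf{e}^N\}$ is then given by

$$\mathbf{A}^*\mathbf{u} = \sum_{k=1}^{N}\left(\sum_{\alpha=1}^{M} A^*{}_k{}^\alpha\,u_\alpha\right)\mathbf{e}^k$$

where the representation of $\mathbf{u}$ itself relative to $\{\mathbf{b}^1,\ldots,\mathbf{b}^M\}$ is

$$\mathbf{u} = \sum_{\alpha=1}^{M} u_\alpha\,\mathbf{b}^\alpha$$

Having defined the linear transformation $\mathbf{A}^*$, we now verify that $\mathbf{A}$ and $\mathbf{A}^*$ satisfy the relation (18.1). Since $\mathbf{A}$ and $\mathbf{A}^*$ are both linear transformations, it suffices to check (18.1) for $\mathbf{u}$ equal to an arbitrary element of the basis $\{\mathbf{b}^1,\ldots,\mathbf{b}^M\}$, say $\mathbf{u} = \mathbf{b}^\alpha$, and for $\mathbf{v}$ equal to an arbitrary element of the basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$, say $\mathbf{v} = \mathbf{e}_k$. For this choice of $\mathbf{u}$ and $\mathbf{v}$ we obtain from (18.2)

$$\mathbf{b}^\alpha \cdot \mathbf{A}\mathbf{e}_k = \mathbf{b}^\alpha \cdot \sum_{\beta=1}^{M} A^\beta{}_k\,\mathbf{b}_\beta = \sum_{\beta=1}^{M} \overline{A^\beta{}_k}\,\delta^\alpha_\beta = \overline{A^\alpha{}_k} \tag{18.5}$$

and likewise from (18.3)

$$\mathbf{A}^*\mathbf{b}^\alpha \cdot \mathbf{e}_k = \left(\sum_{l=1}^{N} A^*{}_l{}^\alpha\,\mathbf{e}^l\right) \cdot \mathbf{e}_k = \sum_{l=1}^{N} A^*{}_l{}^\alpha\,\delta^l_k = A^*{}_k{}^\alpha \tag{18.6}$$

Comparing (18.5) and (18.6) with (18.4), we see that the linear transformation $\mathbf{A}^*$ defined by (18.3) satisfies the condition

$$\mathbf{b}^\alpha \cdot \mathbf{A}\mathbf{e}_k = \mathbf{A}^*\mathbf{b}^\alpha \cdot \mathbf{e}_k$$

for all $\alpha = 1,\ldots,M$ and $k = 1,\ldots,N$, and hence also the condition (18.1) for all $\mathbf{u} \in \mathscr{U}$ and $\mathbf{v} \in \mathscr{V}$.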


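Uniqueness. Assume that there are two functions $\mathbf{A}_1^*: \mathscr{U} \to \mathscr{V}$ and $\mathbf{A}_2^*: \mathscr{U} \to \mathscr{V}$ which satisfy (18.1). Then

$$(\mathbf{A}_1^*\mathbf{u}) \cdot \mathbf{v} = (\mathbf{A}_2^*\mathbf{u}) \cdot \mathbf{v} = \mathbf{u} \cdot (\mathbf{A}\mathbf{v})$$

Thus

$$(\mathbf{A}_1^*\mathbf{u} - \mathbf{A}_2^*\mathbf{u}) \cdot \mathbf{v} = 0$$

Since the last formula must hold for all $\mathbf{v} \in \mathscr{V}$, the inner product properties show that

$$\mathbf{A}_1^*\mathbf{u} = \mathbf{A}_2^*\mathbf{u}$$

This formula must hold for every $\mathbf{u} \in \mathscr{U}$ and thus

$$\mathbf{A}_1^* = \mathbf{A}_2^*$$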
As a corollary to the preceding theorem we see that the adjoint $\mathbf{A}^*$ of a linear transformation $\mathbf{A}$ is a linear transformation. Further, the matrix $[A^*{}_k{}^\alpha]$ that characterizes $\mathbf{A}^*$ by (18.3) is related to the matrix $[A^\alpha{}_k]$ that characterizes $\mathbf{A}$ by (18.2). Notice the choice of bases in (18.2) and (18.3), however.
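
When both bases are orthonormal, each basis coincides with its reciprocal basis, so the matrix of the adjoint is just the conjugate transpose of the matrix of $\mathbf{A}$. The sketch below checks the defining relation (18.1) numerically under that assumption. The matrix `A`, the vectors `u` and `v`, and the helper names are hypothetical, and the inner product is taken to be linear in its first argument and conjugate-linear in its second.

```python
def inner(u, v):
    """Inner product, linear in u and conjugate-linear in v (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(M, v):
    """Apply the matrix M to the column vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def adjoint(M):
    """Conjugate transpose: entry (k, a) is the conjugate of M[a][k]."""
    rows, cols = len(M), len(M[0])
    return [[M[a][k].conjugate() for a in range(rows)] for k in range(cols)]

# Hypothetical 2 x 3 complex matrix: a map from C^3 to C^2 in orthonormal bases.
A = [[1 + 2j, 0j, 3 - 1j],
     [2j, 1 + 0j, -1 + 1j]]
v = [1 + 1j, 2 - 1j, 0.5j]      # element of the domain
u = [3 - 2j, -1 + 4j]           # element of the codomain

lhs = inner(u, matvec(A, v))             # u . (A v), inner product in the codomain
rhs = inner(matvec(adjoint(A), u), v)    # (A* u) . v, inner product in the domain
print(abs(lhs - rhs) < 1e-12)
```

For a basis that is not orthonormal the matrix of $\mathbf{A}^*$ must be taken relative to the reciprocal bases, which is the point of the remark about the choice of bases.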


Other properties of the adjoint are summarized in the following theorem. 


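Theorem 18.2.

(a) $(\mathbf{A}+\mathbf{B})^* = \mathbf{A}^* + \mathbf{B}^*$   (18.7)$_1$

(b) $(\mathbf{A}\mathbf{B})^* = \mathbf{B}^*\mathbf{A}^*$   (18.7)$_2$

(c) $(\lambda\mathbf{A})^* = \overline{\lambda}\,\mathbf{A}^*$   (18.7)$_3$

(d) $\mathbf{0}^* = \mathbf{0}$   (18.7)$_4$

(e) $\mathbf{I}^* = \mathbf{I}$   (18.7)$_5$

(f) $(\mathbf{A}^*)^* = \mathbf{A}$   (18.7)$_6$

and

(g) If $\mathbf{A}$ is nonsingular, so is $\mathbf{A}^*$ and, in addition,

$$(\mathbf{A}^*)^{-1} = (\mathbf{A}^{-1})^* \tag{18.8}$$

In (a), $\mathbf{A}$ and $\mathbf{B}$ are in $\mathscr{L}(\mathscr{V};\mathscr{U})$; in (b), $\mathbf{B} \in \mathscr{L}(\mathscr{V};\mathscr{U})$ and $\mathbf{A} \in \mathscr{L}(\mathscr{U};\mathscr{W})$; in (c), $\lambda \in \mathscr{C}$; in (d), $\mathbf{0}$ is the zero element; and in (e), $\mathbf{I}$ is the identity element in $\mathscr{L}(\mathscr{V};\mathscr{V})$. The proof of the above theorem is straightforward and is left as an exercise for the reader.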


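Theorem 18.3. If $\mathbf{A}: \mathscr{V} \to \mathscr{U}$ is a linear transformation, then $\mathscr{V}$ and $\mathscr{U}$ have the orthogonal decompositions

$$\mathscr{V} = R(\mathbf{A}^*) \oplus K(\mathbf{A}) \tag{18.9}$$

and

$$\mathscr{U} = R(\mathbf{A}) \oplus K(\mathbf{A}^*) \tag{18.10}$$

where

$$R(\mathbf{A}^*) = K(\mathbf{A})^{\perp} \tag{18.11}$$

and

$$R(\mathbf{A}) = K(\mathbf{A}^*)^{\perp} \tag{18.12}$$

Proof. We shall prove (18.11) and (18.12) and then apply Theorem 13.4 to obtain (18.9) and (18.10). Let $\mathbf{u}$ be an arbitrary element in $K(\mathbf{A}^*)$. Then for every $\mathbf{v} \in \mathscr{V}$, $\mathbf{u} \cdot (\mathbf{A}\mathbf{v}) = (\mathbf{A}^*\mathbf{u}) \cdot \mathbf{v} = 0$. Thus $K(\mathbf{A}^*)$ is contained in $R(\mathbf{A})^{\perp}$. Conversely, take $\mathbf{u} \in R(\mathbf{A})^{\perp}$; then for every $\mathbf{v} \in \mathscr{V}$, $(\mathbf{A}^*\mathbf{u}) \cdot \mathbf{v} = \mathbf{u} \cdot (\mathbf{A}\mathbf{v}) = 0$, and thus $\mathbf{A}^*\mathbf{u} = \mathbf{0}$, which implies that $\mathbf{u} \in K(\mathbf{A}^*)$ and that $R(\mathbf{A})^{\perp}$ is in $K(\mathbf{A}^*)$. Therefore, $R(\mathbf{A})^{\perp} = K(\mathbf{A}^*)$, which by Exercise 13.3 implies (18.12). Equation (18.11) follows by an identical argument with $\mathbf{A}$ replaced by $\mathbf{A}^*$. As mentioned above, (18.9) and (18.10) now follow from Theorem 13.4.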


Theorem 18.4. Given a linear transformation $\mathbf{A}: \mathscr{V} \to \mathscr{U}$, then $\mathbf{A}$ and $\mathbf{A}^*$ have the same rank.

Proof. By application of Theorems 10.9 and 18.3,

$$\dim\mathscr{V} = \dim R(\mathbf{A}^*) + \dim K(\mathbf{A})$$

However, by (15.6),

$$\dim\mathscr{V} = \dim R(\mathbf{A}) + \dim K(\mathbf{A})$$

Therefore,

$$\dim R(\mathbf{A}) = \dim R(\mathbf{A}^*) \tag{18.13}$$

which is the desired result.

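An endomorphism $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ is called Hermitian if $\mathbf{A} = \mathbf{A}^*$ and skew-Hermitian if $\mathbf{A} = -\mathbf{A}^*$. It should be pointed out, however, that the terms symmetric and skew-symmetric are often used instead of Hermitian and skew-Hermitian for linear transformations defined on real inner product spaces. The following theorem, which follows directly from the above definitions and from (18.1), characterizes Hermitian and skew-Hermitian endomorphisms.


Theorem 18.5. An endomorphism $\mathbf{A}$ is Hermitian if and only if

$$\mathbf{v}_1 \cdot (\mathbf{A}\mathbf{v}_2) = (\mathbf{A}\mathbf{v}_1) \cdot \mathbf{v}_2 \tag{18.14}$$

for all $\mathbf{v}_1, \mathbf{v}_2 \in \mathscr{V}$, and it is skew-Hermitian if and only if

$$\mathbf{v}_1 \cdot (\mathbf{A}\mathbf{v}_2) = -(\mathbf{A}\mathbf{v}_1) \cdot \mathbf{v}_2 \tag{18.15}$$

for all $\mathbf{v}_1, \mathbf{v}_2 \in \mathscr{V}$.

We shall denote by $\mathscr{S}(\mathscr{V};\mathscr{V})$ and $\mathscr{A}(\mathscr{V};\mathscr{V})$ the subsets of $\mathscr{L}(\mathscr{V};\mathscr{V})$ defined by

$$\mathscr{S}(\mathscr{V};\mathscr{V}) = \{\mathbf{A} \mid \mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V}) \text{ and } \mathbf{A} = \mathbf{A}^*\}$$

and

$$\mathscr{A}(\mathscr{V};\mathscr{V}) = \{\mathbf{A} \mid \mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V}) \text{ and } \mathbf{A} = -\mathbf{A}^*\}$$

In the special case of a real inner product space, it is easy to show that $\mathscr{S}(\mathscr{V};\mathscr{V})$ and $\mathscr{A}(\mathscr{V};\mathscr{V})$ are both subspaces of $\mathscr{L}(\mathscr{V};\mathscr{V})$. In particular, $\mathscr{L}(\mathscr{V};\mathscr{V})$ has the following decomposition: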

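Theorem 18.6. For a real inner product space,

$$\mathscr{L}(\mathscr{V};\mathscr{V}) = \mathscr{S}(\mathscr{V};\mathscr{V}) \oplus \mathscr{A}(\mathscr{V};\mathscr{V}) \tag{18.16}$$

Proof. An arbitrary element $\mathbf{L} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ can always be written

$$\mathbf{L} = \mathbf{S} + \mathbf{A} \tag{18.17}$$

where

$$\mathbf{S} = \tfrac{1}{2}\left(\mathbf{L} + \mathbf{L}^T\right) = \mathbf{S}^T \quad\text{and}\quad \mathbf{A} = \tfrac{1}{2}\left(\mathbf{L} - \mathbf{L}^T\right) = -\mathbf{A}^T$$

Here the superscript $T$ denotes the transpose, which is the specialization of the adjoint for a real inner product space. Since $\mathbf{S} \in \mathscr{S}(\mathscr{V};\mathscr{V})$ and $\mathbf{A} \in \mathscr{A}(\mathscr{V};\mathscr{V})$, (18.17) shows that

$$\mathscr{L}(\mathscr{V};\mathscr{V}) = \mathscr{S}(\mathscr{V};\mathscr{V}) + \mathscr{A}(\mathscr{V};\mathscr{V})$$

Now, let $\mathbf{B} \in \mathscr{S}(\mathscr{V};\mathscr{V}) \cap \mathscr{A}(\mathscr{V};\mathscr{V})$. Then $\mathbf{B}$ must satisfy the conditions

$$\mathbf{B} = \mathbf{B}^T \quad\text{and}\quad \mathbf{B} = -\mathbf{B}^T$$

Thus, $\mathbf{B} = \mathbf{0}$ and the proof is complete.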
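
The decomposition (18.17) is easy to carry out on a matrix. The following sketch, written for a real 3 by 3 matrix relative to an orthonormal basis, forms the symmetric and skew-symmetric parts as one half of the sum and one half of the difference with the transpose. The matrix `L` is a hypothetical example and the variable names are ours.

```python
def transpose(M):
    """Matrix transpose."""
    return [list(row) for row in zip(*M)]

# Hypothetical real matrix of an endomorphism relative to an orthonormal basis.
L = [[1.0, 2.0, 0.0],
     [4.0, 3.0, -2.0],
     [6.0, 8.0, 5.0]]
LT = transpose(L)
n = len(L)

# Symmetric part S = (L + L^T)/2 and skew-symmetric part A = (L - L^T)/2.
S = [[0.5 * (L[i][j] + LT[i][j]) for j in range(n)] for i in range(n)]
A = [[0.5 * (L[i][j] - LT[i][j]) for j in range(n)] for i in range(n)]

recombined = [[S[i][j] + A[i][j] for j in range(n)] for i in range(n)]
minus_AT = [[-x for x in row] for row in transpose(A)]

print(S == transpose(S), A == minus_AT, recombined == L)
```

The entries were chosen so that every halving is exact in floating point, which lets the three comparisons be made with plain equality.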


We shall see in the exercises at the end of Section 19 that a real inner product can be defined on $\mathscr{L}(\mathscr{V};\mathscr{V})$ in such a fashion that the subspace of symmetric endomorphisms is the orthogonal complement of the subspace of skew-symmetric endomorphisms. For complex inner product spaces, however, $\mathscr{S}(\mathscr{V};\mathscr{V})$ and $\mathscr{A}(\mathscr{V};\mathscr{V})$ are not subspaces of $\mathscr{L}(\mathscr{V};\mathscr{V})$. For example, if $\mathbf{A} \in \mathscr{S}(\mathscr{V};\mathscr{V})$, then $i\mathbf{A}$ is in $\mathscr{A}(\mathscr{V};\mathscr{V})$ because $(i\mathbf{A})^* = \overline{i}\,\mathbf{A}^* = -i\mathbf{A}^* = -i\mathbf{A}$, where (18.7)$_3$ has been used.


A linear transformation $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{U})$ is unitary if

$$\mathbf{A}\mathbf{v}_1 \cdot \mathbf{A}\mathbf{v}_2 = \mathbf{v}_1 \cdot \mathbf{v}_2 \tag{18.18}$$

for all $\mathbf{v}_1, \mathbf{v}_2 \in \mathscr{V}$. However, for a real inner product space the above condition defines $\mathbf{A}$ to be orthogonal. Essentially, (18.18) asserts that unitary (or orthogonal) linear transformations preserve the inner products.


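Theorem 18.7. If $\mathbf{A}$ is unitary, then it is regular.


Proof. Take $\mathbf{v}_1 = \mathbf{v}_2 = \mathbf{v}$ in (18.18), and by (12.1) we find

$$\|\mathbf{A}\mathbf{v}\| = \|\mathbf{v}\| \tag{18.19}$$

Thus, if $\mathbf{A}\mathbf{v} = \mathbf{0}$, then $\mathbf{v} = \mathbf{0}$, which proves the theorem.


Theorem 18.8. $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{U})$ is unitary if and only if $\|\mathbf{A}\mathbf{v}\| = \|\mathbf{v}\|$ for all $\mathbf{v} \in \mathscr{V}$.


Proof. If $\mathbf{A}$ is unitary, we saw in the proof of Theorem 18.7 that $\|\mathbf{A}\mathbf{v}\| = \|\mathbf{v}\|$. Thus, we shall assume $\|\mathbf{A}\mathbf{v}\| = \|\mathbf{v}\|$ for all $\mathbf{v} \in \mathscr{V}$ and attempt to derive (18.18). This derivation is routine because, by the polar identity of Exercise 12.1,

$$2\,\mathbf{A}\mathbf{v}_1 \cdot \mathbf{A}\mathbf{v}_2 = \|\mathbf{A}(\mathbf{v}_1 + \mathbf{v}_2)\|^2 + i\,\|\mathbf{A}(\mathbf{v}_1 + i\mathbf{v}_2)\|^2 - (1+i)\left(\|\mathbf{A}\mathbf{v}_1\|^2 + \|\mathbf{A}\mathbf{v}_2\|^2\right)$$

Therefore, by (18.19),

$$2\,\mathbf{A}\mathbf{v}_1 \cdot \mathbf{A}\mathbf{v}_2 = \|\mathbf{v}_1 + \mathbf{v}_2\|^2 + i\,\|\mathbf{v}_1 + i\mathbf{v}_2\|^2 - (1+i)\left(\|\mathbf{v}_1\|^2 + \|\mathbf{v}_2\|^2\right) = 2\,\mathbf{v}_1 \cdot \mathbf{v}_2$$

This proof cannot be specialized directly to a real inner product space since the polar identity is valid for a complex inner product space only. We leave the proof for the real case as an exercise.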


If we require $\mathscr{V}$ and $\mathscr{U}$ to have the same dimension, then Theorem 15.10 ensures that a unitary transformation $\mathbf{A}$ is an isomorphism. In this case we can use (18.1) and (18.18) and conclude the following.

Theorem 18.9. Given a linear transformation $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{U})$, where $\dim\mathscr{V} = \dim\mathscr{U}$, then $\mathbf{A}$ is unitary if and only if it is an isomorphism whose inverse satisfies

$$\mathbf{A}^{-1} = \mathbf{A}^* \tag{18.20}$$


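Recall from Theorem 15.5 that a regular linear transformation maps linearly independent vectors into linearly independent vectors. Therefore if $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ is a basis for $\mathscr{V}$ and $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{U})$ is regular, then $\{\mathbf{A}\mathbf{e}_1,\ldots,\mathbf{A}\mathbf{e}_N\}$ is a basis for $R(\mathbf{A})$, which is a subspace in $\mathscr{U}$. If $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ is orthonormal and $\mathbf{A}$ is unitary, it easily follows that $\{\mathbf{A}\mathbf{e}_1,\ldots,\mathbf{A}\mathbf{e}_N\}$ is also orthonormal. Thus the image of an orthonormal basis under a unitary transformation is also an orthonormal set. Conversely, a linear transformation which sends an orthonormal basis of $\mathscr{V}$ into an orthonormal basis of $R(\mathbf{A})$ must be unitary. To prove this assertion, let $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ be orthonormal, and let $\{\mathbf{b}_1,\ldots,\mathbf{b}_N\}$ be an orthonormal set in $\mathscr{U}$, where $\mathbf{b}_k = \mathbf{A}\mathbf{e}_k$, $k = 1,\ldots,N$. Then, if $\mathbf{v}_1$ and $\mathbf{v}_2$ are arbitrary elements of $\mathscr{V}$,

$$\mathbf{v}_1 = \sum_{k=1}^{N} v_1^k\,\mathbf{e}_k \quad\text{and}\quad \mathbf{v}_2 = \sum_{l=1}^{N} v_2^l\,\mathbf{e}_l$$

we have

$$\mathbf{A}\mathbf{v}_1 = \sum_{k=1}^{N} v_1^k\,\mathbf{b}_k \quad\text{and}\quad \mathbf{A}\mathbf{v}_2 = \sum_{l=1}^{N} v_2^l\,\mathbf{b}_l$$

and, thus,

$$\mathbf{A}\mathbf{v}_1 \cdot \mathbf{A}\mathbf{v}_2 = \sum_{k=1}^{N}\sum_{l=1}^{N} v_1^k\,\overline{v_2^l}\;\mathbf{b}_k \cdot \mathbf{b}_l = \sum_{k=1}^{N} v_1^k\,\overline{v_2^k} = \mathbf{v}_1 \cdot \mathbf{v}_2 \tag{18.21}$$

Equation (18.21) establishes the desired result.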
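
For a real inner product space, a plane rotation gives a simple orthogonal transformation on which (18.18) and (18.20) can be checked. The sketch below is illustrative only: the angle `theta`, the vectors `v1` and `v2`, and the helper names are assumptions of ours, and comparisons use a small tolerance because of floating-point rounding.

```python
import math

theta = 0.7  # arbitrary rotation angle, chosen only for illustration
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def matvec(M, v):
    """Apply the matrix M to the column vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    """Real inner product."""
    return sum(a * b for a, b in zip(u, v))

v1 = [1.0, 2.0]
v2 = [-3.0, 0.5]

lhs = dot(matvec(Q, v1), matvec(Q, v2))   # Qv1 . Qv2
rhs = dot(v1, v2)                         # v1 . v2, cf. (18.18)

# Q^T Q should be the identity, i.e. the inverse of Q is its transpose, cf. (18.20).
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

print(abs(lhs - rhs) < 1e-12)
print(all(abs(QtQ[i][j] - (1.0 if i == j else 0.0)) < 1e-12
          for i in range(2) for j in range(2)))
```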


Recall that in Section 17 we introduced the general linear group $GL(\mathscr{V})$. We define a subset $U(\mathscr{V})$ of $GL(\mathscr{V})$ by

$$U(\mathscr{V}) = \{\mathbf{A} \mid \mathbf{A} \in GL(\mathscr{V}) \text{ and } \mathbf{A}^{-1} = \mathbf{A}^*\}$$

which is easily shown to be a subgroup. This subgroup is called the unitary group of $\mathscr{V}$.

In Section 17 we defined projections $\mathbf{P}: \mathscr{V} \to \mathscr{V}$ by the characteristic property

$$\mathbf{P}^2 = \mathbf{P} \tag{18.22}$$

In particular, we showed in Theorem 17.3 that $\mathscr{V}$ has the decomposition

$$\mathscr{V} = R(\mathbf{P}) \oplus K(\mathbf{P}) \tag{18.23}$$

There are several additional properties of projections which are worthy of discussion here.


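Theorem 18.10. If $\mathbf{P}$ is a projection, then $\mathbf{P} = \mathbf{P}^* \Leftrightarrow R(\mathbf{P}) = K(\mathbf{P})^{\perp}$.


Proof. First take $\mathbf{P} = \mathbf{P}^*$ and let $\mathbf{v}$ be an arbitrary element of $\mathscr{V}$. Then by (18.23)

$$\mathbf{v} = \mathbf{u} + \mathbf{w}$$

where $\mathbf{u} = \mathbf{P}\mathbf{v}$ and $\mathbf{w} = \mathbf{v} - \mathbf{P}\mathbf{v}$. Then

$$\mathbf{u} \cdot \mathbf{w} = \mathbf{P}\mathbf{v} \cdot (\mathbf{v} - \mathbf{P}\mathbf{v}) = \mathbf{P}\mathbf{v} \cdot \mathbf{v} - \mathbf{P}\mathbf{v} \cdot \mathbf{P}\mathbf{v} = \mathbf{P}\mathbf{v} \cdot \mathbf{v} - \mathbf{P}^*\mathbf{P}\mathbf{v} \cdot \mathbf{v} = \mathbf{P}\mathbf{v} \cdot \mathbf{v} - \mathbf{P}^2\mathbf{v} \cdot \mathbf{v} = 0$$

where (18.22), (18.1), and the assumption that $\mathbf{P}$ is Hermitian have been used. Conversely, assume $\mathbf{u} \cdot \mathbf{w} = 0$ for all $\mathbf{u} \in R(\mathbf{P})$ and all $\mathbf{w} \in K(\mathbf{P})$. Then if $\mathbf{v}_1$ and $\mathbf{v}_2$ are arbitrary vectors in $\mathscr{V}$,

$$\mathbf{P}\mathbf{v}_1 \cdot \mathbf{v}_2 = \mathbf{P}\mathbf{v}_1 \cdot (\mathbf{P}\mathbf{v}_2 + \mathbf{v}_2 - \mathbf{P}\mathbf{v}_2) = \mathbf{P}\mathbf{v}_1 \cdot \mathbf{P}\mathbf{v}_2$$

and, by interchanging $\mathbf{v}_1$ and $\mathbf{v}_2$,

$$\mathbf{P}\mathbf{v}_2 \cdot \mathbf{v}_1 = \mathbf{P}\mathbf{v}_2 \cdot \mathbf{P}\mathbf{v}_1$$

Therefore,

$$\mathbf{P}\mathbf{v}_1 \cdot \mathbf{v}_2 = \overline{\mathbf{P}\mathbf{v}_2 \cdot \mathbf{v}_1} = \mathbf{v}_1 \cdot (\mathbf{P}\mathbf{v}_2)$$

This last result and Theorem 18.5 show that $\mathbf{P}$ is Hermitian.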


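Because of Theorem 18.10, Hermitian projections are called perpendicular projections. In Section 13 we introduced the concept of an orthogonal complement of a subspace of an inner product space. A similar concept is that of an orthogonal pair of subspaces. If $\mathscr{V}_1$ and $\mathscr{V}_2$ are subspaces of $\mathscr{V}$, they are orthogonal, written $\mathscr{V}_1 \perp \mathscr{V}_2$, if $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$ for all $\mathbf{v}_1 \in \mathscr{V}_1$ and $\mathbf{v}_2 \in \mathscr{V}_2$.


Theorem 18.11. If $\mathscr{V}_1$ and $\mathscr{V}_2$ are subspaces of $\mathscr{V}$, $\mathbf{P}_1$ is the perpendicular projection of $\mathscr{V}$ onto $\mathscr{V}_1$, and $\mathbf{P}_2$ is the perpendicular projection of $\mathscr{V}$ onto $\mathscr{V}_2$, then $\mathscr{V}_1 \perp \mathscr{V}_2$ if and only if $\mathbf{P}_2\mathbf{P}_1 = \mathbf{0}$.


Proof. Assume that $\mathscr{V}_1 \perp \mathscr{V}_2$; then $\mathbf{P}_1\mathbf{v} \in \mathscr{V}_2^{\perp}$ for all $\mathbf{v} \in \mathscr{V}$ and thus $\mathbf{P}_2\mathbf{P}_1\mathbf{v} = \mathbf{0}$ for all $\mathbf{v} \in \mathscr{V}$, which yields $\mathbf{P}_2\mathbf{P}_1 = \mathbf{0}$. Next assume $\mathbf{P}_2\mathbf{P}_1 = \mathbf{0}$; this implies that $\mathbf{P}_1\mathbf{v} \in \mathscr{V}_2^{\perp}$ for every $\mathbf{v} \in \mathscr{V}$. Therefore $\mathscr{V}_1$ is contained in $\mathscr{V}_2^{\perp}$ and, as a result, $\mathscr{V}_1 \perp \mathscr{V}_2$.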
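
Theorem 18.11 can be illustrated with one-dimensional subspaces of real three-dimensional space, where the perpendicular projection onto the line spanned by a vector $\mathbf{n}$ has the matrix $\mathbf{n}\mathbf{n}^T/(\mathbf{n}\cdot\mathbf{n})$ relative to an orthonormal basis. This is a sketch under that assumption; the three spanning vectors and the helper names are hypothetical choices.

```python
def perp_projection(n):
    """Matrix of the perpendicular projection onto span{n}: n n^T / (n . n)."""
    nn = sum(x * x for x in n)
    return [[n[i] * n[j] / nn for j in range(len(n))] for i in range(len(n))]

def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_zero(M):
    """True when every entry of M vanishes."""
    return all(x == 0 for row in M for x in row)

P1 = perp_projection([1.0, 1.0, 0.0])   # line in the 1-2 plane
P2 = perp_projection([0.0, 0.0, 1.0])   # line orthogonal to the first one
P3 = perp_projection([1.0, 0.0, 0.0])   # line NOT orthogonal to the first one

print(is_zero(matmul(P2, P1)))   # orthogonal subspaces: product vanishes
print(is_zero(matmul(P3, P1)))   # non-orthogonal subspaces: product does not vanish
```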


Exercises 


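18.1 Prove Theorem 18.2.

18.2 For a real inner product space, prove that $\dim\mathscr{S}(\mathscr{V};\mathscr{V}) = \frac{1}{2}N(N+1)$ and $\dim\mathscr{A}(\mathscr{V};\mathscr{V}) = \frac{1}{2}N(N-1)$, where $N = \dim\mathscr{V}$.

18.3 Define $\Phi: \mathscr{L}(\mathscr{V};\mathscr{V}) \to \mathscr{L}(\mathscr{V};\mathscr{V})$ and $\Psi: \mathscr{L}(\mathscr{V};\mathscr{V}) \to \mathscr{L}(\mathscr{V};\mathscr{V})$ by

$$\Phi\mathbf{A} = \tfrac{1}{2}\left(\mathbf{A} + \mathbf{A}^T\right) \quad\text{and}\quad \Psi\mathbf{A} = \tfrac{1}{2}\left(\mathbf{A} - \mathbf{A}^T\right)$$

where $\mathscr{V}$ is a real inner product space. Show that $\Phi$ and $\Psi$ are projections.

18.4 Let $\mathscr{V}_1$ and $\mathscr{V}_2$ be subspaces of $\mathscr{V}$ and let $\mathbf{P}_1$ be the perpendicular projection of $\mathscr{V}$ onto $\mathscr{V}_1$ and $\mathbf{P}_2$ be the perpendicular projection of $\mathscr{V}$ onto $\mathscr{V}_2$. Show that $\mathbf{P}_1 + \mathbf{P}_2$ is a perpendicular projection if and only if $\mathscr{V}_1 \perp \mathscr{V}_2$.

18.5 Let $\mathbf{P}_1$ and $\mathbf{P}_2$ be the projections introduced in Exercise 18.4. Show that $\mathbf{P}_1 - \mathbf{P}_2$ is a projection if and only if $\mathscr{V}_2$ is a subspace of $\mathscr{V}_1$.

18.6 Show that $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ is skew-symmetric $\Leftrightarrow$ $\mathbf{v} \cdot \mathbf{A}\mathbf{v} = 0$ for all $\mathbf{v} \in \mathscr{V}$, where $\mathscr{V}$ is a real vector space.

18.7 A linear transformation $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ is normal if $\mathbf{A}^*\mathbf{A} = \mathbf{A}\mathbf{A}^*$. Show that $\mathbf{A}$ is normal if and only if

$$\mathbf{A}\mathbf{v}_1 \cdot \mathbf{A}\mathbf{v}_2 = \mathbf{A}^*\mathbf{v}_1 \cdot \mathbf{A}^*\mathbf{v}_2$$

for all $\mathbf{v}_1, \mathbf{v}_2$ in $\mathscr{V}$.

18.8 Show that $\mathbf{v} \cdot ((\mathbf{A} + \mathbf{A}^*)\mathbf{v})$ is real for every $\mathbf{v} \in \mathscr{V}$ and every $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$.

18.9 Show that every linear transformation $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$, where $\mathscr{V}$ is a complex inner product space, has the unique decomposition

$$\mathbf{A} = \mathbf{B} + i\mathbf{C}$$

where $\mathbf{B}$ and $\mathbf{C}$ are both Hermitian.

18.10 Prove Theorem 18.8 for real inner product spaces. Hint: A polar identity for a real inner product space is

$$2\,\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u} + \mathbf{v}\|^2 - \|\mathbf{u}\|^2 - \|\mathbf{v}\|^2$$

18.11 Let $\mathbf{A}$ be an endomorphism of $\mathscr{R}^3$ whose matrix relative to the standard basis is

$$\begin{bmatrix} 1 & 2 & 1 \\ 4 & 6 & 3 \\ 1 & 0 & 0 \end{bmatrix}$$

What is $K(\mathbf{A})$? What is $R(\mathbf{A}^T)$? Check the results of Theorem 18.3 for this particular endomorphism.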
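Section 19. Component Formulas


In this section we shall introduce the components of a linear transformation and several related ideas. Let $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{U})$, $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ a basis for $\mathscr{V}$, and $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$ a basis for $\mathscr{U}$. The vector $\mathbf{A}\mathbf{e}_k$ is in $\mathscr{U}$ and, as a result, can be expanded in a basis of $\mathscr{U}$ in the form

$$\mathbf{A}\mathbf{e}_k = \sum_{\alpha=1}^{M} A^\alpha{}_k\,\mathbf{b}_\alpha \tag{19.1}$$

The $MN$ scalars $A^\alpha{}_k$ ($\alpha = 1,\ldots,M$; $k = 1,\ldots,N$) are called the components of $\mathbf{A}$ with respect to the bases. If $\{\mathbf{b}^1,\ldots,\mathbf{b}^M\}$ is a basis of $\mathscr{U}$ which is reciprocal to $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$, (19.1) yields

$$A^\alpha{}_k = (\mathbf{A}\mathbf{e}_k) \cdot \mathbf{b}^\alpha \tag{19.2}$$

Under change of bases in $\mathscr{V}$ and $\mathscr{U}$ defined by [cf. (14.18) and (14.22)]

$$\mathbf{e}_k = \sum_{j=1}^{N} \hat{T}^j{}_k\,\hat{\mathbf{e}}_j \tag{19.3}$$

and

$$\mathbf{b}^\alpha = \sum_{\beta=1}^{M} \overline{S^\alpha{}_\beta}\,\hat{\mathbf{b}}^\beta \tag{19.4}$$

(19.2) yields the transformation rule

$$A^\alpha{}_k = \sum_{\beta=1}^{M}\sum_{j=1}^{N} S^\alpha{}_\beta\,\hat{A}^\beta{}_j\,\hat{T}^j{}_k \tag{19.5}$$

where

$$\hat{A}^\beta{}_j = (\mathbf{A}\hat{\mathbf{e}}_j) \cdot \hat{\mathbf{b}}^\beta$$

If $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$, (19.1) and (19.5) specialize to

$$\mathbf{A}\mathbf{e}_k = \sum_{q=1}^{N} A^q{}_k\,\mathbf{e}_q \tag{19.6}$$

and

$$A^q{}_k = \sum_{s,j=1}^{N} T^q{}_s\,\hat{A}^s{}_j\,\hat{T}^j{}_k \tag{19.7}$$

The trace of an endomorphism is a function $\operatorname{tr}: \mathscr{L}(\mathscr{V};\mathscr{V}) \to \mathscr{C}$ defined by

$$\operatorname{tr}\mathbf{A} = \sum_{k=1}^{N} A^k{}_k \tag{19.8}$$

It easily follows from (19.7) and (14.21) that $\operatorname{tr}\mathbf{A}$ is independent of the choice of basis of $\mathscr{V}$. Later we shall give a definition of the trace which does not employ the use of a basis.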
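
In matrix form, the rule (19.7) says that the component matrices in the two bases are related by a similarity transformation, and the trace is unchanged by it. The sketch below checks this on a small integer example; the matrices `A_hat`, `T` and `T_inv` are hypothetical, with `T_inv` entered by hand as the inverse of `T`.

```python
def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    """Sum of the diagonal components, cf. (19.8)."""
    return sum(M[k][k] for k in range(len(M)))

# Hypothetical components of an endomorphism in the hatted basis.
A_hat = [[2, 1],
         [0, 3]]
# Hypothetical change-of-basis matrix and its inverse (T T_inv = I).
T = [[1, 1],
     [0, 1]]
T_inv = [[1, -1],
         [0, 1]]

A_new = matmul(matmul(T, A_hat), T_inv)   # matrix form of (19.7)

print(A_new)                              # components differ from A_hat in general
print(trace(A_new) == trace(A_hat))       # the trace is the same in both bases
```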


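If $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{U})$, then $\mathbf{A}^* \in \mathscr{L}(\mathscr{U};\mathscr{V})$. The components of $\mathbf{A}^*$ are obtained by the same logic as was used in obtaining (19.1). For example,

$$\mathbf{A}^*\mathbf{b}^\alpha = \sum_{k=1}^{N} A^*{}_k{}^\alpha\,\mathbf{e}^k \tag{19.9}$$

where the $MN$ scalars $A^*{}_k{}^\alpha$ ($k = 1,\ldots,N$; $\alpha = 1,\ldots,M$) are the components of $\mathbf{A}^*$ with respect to $\{\mathbf{b}^1,\ldots,\mathbf{b}^M\}$ and $\{\mathbf{e}^1,\ldots,\mathbf{e}^N\}$. From the proof of Theorem 18.1 we can relate these components of $\mathbf{A}^*$ to those of $\mathbf{A}$ in (19.1); namely,

$$A^*{}_k{}^\alpha = \overline{A^\alpha{}_k} \tag{19.10}$$

If the inner product spaces $\mathscr{U}$ and $\mathscr{V}$ are real, (19.10) reduces to

$$A^T{}_k{}^\alpha = A^\alpha{}_k \tag{19.11}$$

By the same logic which produced (19.1), we can also write

$$\mathbf{A}\mathbf{e}_k = \sum_{\alpha=1}^{M} A_{\alpha k}\,\mathbf{b}^\alpha \tag{19.12}$$

and

$$\mathbf{A}\mathbf{e}^k = \sum_{\alpha=1}^{M} A^{\alpha k}\,\mathbf{b}_\alpha = \sum_{\alpha=1}^{M} A_\alpha{}^k\,\mathbf{b}^\alpha \tag{19.13}$$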


If we use (14.6) and (14.9), the various components of A are related by 


N M N M 
AY = yA" ek = Db As es E > Anab” (19.14) 
=1 B=1 s=1 p=1 
where 
= S oB pa p _ Lpa 
biz =b, ‘b, = Dog and b? =b’-b” =b (19.15) 


A similar set of formulas holds for the components of A*. Equations (19.14) and (19.10) can be 
used to obtain these formulas. The transformation rules for the components defined by (19.12) 
and (19.13) are easily established to be 


M N -——., ; 
A,, =(Ae,)-b, = Ses, K (19.16) 
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A” =(Ae‘)-b* = SyisAeire (19.17) 
Bel j=l 
and 
M NN) __ 
A,‘ =(Ae")-b, =X 9 S?A,/TS (19.18) 
B=1 j=l 
where 
A, =(Aé,)-b, (19.19) 
Al! =(Aé’)-b? (19.20) 
and 
A, =(Aé’)-b, (19.21) 


The quantities S £ introduced in (19.16) and (19.18) above are related to the quantities S; by 
formulas like (14.20) and (14.21). 


Exercises 


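19.1 Express (18.20), written in the form $\mathbf{A}^*\mathbf{A} = \mathbf{I}$, in components.

19.2 Verify (19.14).

19.3 Establish the following properties of the trace of endomorphisms of $\mathscr{V}$:

$$\operatorname{tr}\mathbf{I} = \dim\mathscr{V}$$
$$\operatorname{tr}(\mathbf{A} + \mathbf{B}) = \operatorname{tr}\mathbf{A} + \operatorname{tr}\mathbf{B}$$
$$\operatorname{tr}(\lambda\mathbf{A}) = \lambda\operatorname{tr}\mathbf{A}$$
$$\operatorname{tr}\mathbf{A}\mathbf{B} = \operatorname{tr}\mathbf{B}\mathbf{A}$$

and

$$\operatorname{tr}\mathbf{A}^* = \overline{\operatorname{tr}\mathbf{A}}$$

19.4 If $\mathbf{A}, \mathbf{B} \in \mathscr{L}(\mathscr{V};\mathscr{V})$, where $\mathscr{V}$ is an inner product space, define

$$\mathbf{A} \cdot \mathbf{B} = \operatorname{tr}\mathbf{A}\mathbf{B}^*$$

Show that this definition makes $\mathscr{L}(\mathscr{V};\mathscr{V})$ into an inner product space.

19.5 In the special case when $\mathscr{V}$ is a real inner product space, show that the inner product of Exercise 19.4 implies that $\mathscr{A}(\mathscr{V};\mathscr{V}) \perp \mathscr{S}(\mathscr{V};\mathscr{V})$.

19.6 Verify formulas (19.16)-(19.18).

19.7 Use the results of Exercise 14.6 along with (19.14) to derive the transformation rules (19.16)-(19.18).

19.8 Given an endomorphism $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$, show that

$$\operatorname{tr}\mathbf{A} = \sum_{k=1}^{N} A_k{}^k$$

where $A_k{}^q = (\mathbf{A}\mathbf{e}^q) \cdot \mathbf{e}_k$. Thus, we can compute the trace of an endomorphism from (19.8) or from the formula above and be assured the same result is obtained in both cases. Of course, the formula above is also independent of the basis of $\mathscr{V}$.

19.9 Exercise 19.8 shows that $\operatorname{tr}\mathbf{A}$ can be computed from two of the four possible sets of components of $\mathbf{A}$. Show that the quantities $\sum_{k=1}^{N} A^{kk}$ and $\sum_{k=1}^{N} A_{kk}$ are not equal to each other in general and, moreover, do not equal $\operatorname{tr}\mathbf{A}$. In addition, show that each of these quantities depends upon the choice of basis of $\mathscr{V}$.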


19.10 Show that 


and 
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A", =(A’b,)-e = A,' 


a a 


Chapter 5 


DETERMINANTS AND MATRICES 


In Chapter 0 we introduced the concept of a matrix and examined certain manipulations one 
can carry out with matrices. In Section 7 we indicated that the set of 2 by 2 matrices with real 
numbers for its elements forms a ring with respect to the operations of matrix addition and matrix 
multiplication. In this chapter we shall consider further the concept of a matrix and its relation to a 
linear transformation. 


Section 20. The Generalized Kronecker Deltas and the Summation Convention 


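Recall that an $M \times N$ matrix is an array written in any one of the forms

$$A = \begin{bmatrix} A^1{}_1 & \cdots & A^1{}_N \\ \vdots & & \vdots \\ A^M{}_1 & \cdots & A^M{}_N \end{bmatrix} = [A^i{}_j], \qquad A = \begin{bmatrix} A_{11} & \cdots & A_{1N} \\ \vdots & & \vdots \\ A_{M1} & \cdots & A_{MN} \end{bmatrix} = [A_{ij}]$$

$$A = \begin{bmatrix} A_1{}^1 & \cdots & A_1{}^N \\ \vdots & & \vdots \\ A_M{}^1 & \cdots & A_M{}^N \end{bmatrix} = [A_i{}^j], \qquad A = \begin{bmatrix} A^{11} & \cdots & A^{1N} \\ \vdots & & \vdots \\ A^{M1} & \cdots & A^{MN} \end{bmatrix} = [A^{ij}] \tag{20.1}$$

Throughout this section the placement of the indices is of no consequence. The components of the matrix are allowed to be complex numbers. The set of $M \times N$ matrices shall be denoted by $\mathscr{M}^{M \times N}$. It is an elementary exercise to show that the rules of addition and scalar multiplication ensure that $\mathscr{M}^{M \times N}$ is a vector space. The reader will recall that in Section 8 we used the set $\mathscr{M}^{M \times N}$ as one of several examples of a vector space. We leave as an exercise to the reader the fact that the dimension of $\mathscr{M}^{M \times N}$ is $MN$.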


In order to give a definition of a determinant of a square matrix, we need to define a permutation and consider certain of its properties. Consider a set of $K$ elements $\{\alpha_1,\ldots,\alpha_K\}$. A permutation is a one-to-one function from $\{\alpha_1,\ldots,\alpha_K\}$ to $\{\alpha_1,\ldots,\alpha_K\}$. If $\sigma$ is a permutation, it is customary to write

$$\sigma = \begin{pmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_K \\ \sigma(\alpha_1) & \sigma(\alpha_2) & \cdots & \sigma(\alpha_K) \end{pmatrix} \tag{20.2}$$

It is a known result in algebra that permutations can be classified into even and odd ones: The permutation $\sigma$ in (20.2) is even if an even number of pairwise interchanges of the bottom row is required to order the bottom row exactly like the top row, and $\sigma$ is odd if that number is odd. For a given permutation $\sigma$, the number of pairwise interchanges required to order the bottom row the same as the top row is not unique, but it is proved in algebra that those numbers are either all even or all odd. Therefore the definition for $\sigma$ to be even or odd is meaningful. For example, the permutation

$$\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$$

is odd, while the permutation

$$\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$$

is even.

The parity of $\sigma$, denoted by $\varepsilon_\sigma$, is defined by

$$\varepsilon_\sigma = \begin{cases} +1 & \text{if } \sigma \text{ is an even permutation} \\ -1 & \text{if } \sigma \text{ is an odd permutation} \end{cases}$$

All of the applications of permutations we shall have are for permutations defined on $K$ ($K \le N$) positive integers selected from the set of $N$ positive integers $\{1,2,3,\ldots,N\}$. Let $\{i_1,\ldots,i_K\}$ and $\{j_1,\ldots,j_K\}$ be two subsets of $\{1,2,3,\ldots,N\}$. If we order these two subsets and construct the two $K$-tuples $(i_1,\ldots,i_K)$ and $(j_1,\ldots,j_K)$, we can define the generalized Kronecker delta as follows: The generalized Kronecker delta, denoted by

$$\delta^{i_1 i_2 \cdots i_K}_{j_1 j_2 \cdots j_K}$$

is defined by

$$\delta^{i_1 i_2 \cdots i_K}_{j_1 j_2 \cdots j_K} = \begin{cases} 0 & \text{if the integers } (i_1,\ldots,i_K) \text{ or } (j_1,\ldots,j_K) \text{ are not distinct} \\[4pt] 0 & \text{if the integers } (i_1,\ldots,i_K) \text{ and } (j_1,\ldots,j_K) \text{ are distinct but the sets } \{i_1,\ldots,i_K\} \text{ and } \{j_1,\ldots,j_K\} \text{ are not equal} \\[4pt] \varepsilon_\sigma & \text{if the integers } (i_1,\ldots,i_K) \text{ and } (j_1,\ldots,j_K) \text{ are distinct and the sets } \{i_1,\ldots,i_K\} \text{ and } \{j_1,\ldots,j_K\} \text{ are equal, where } \sigma = \begin{pmatrix} i_1 & i_2 & \cdots & i_K \\ j_1 & j_2 & \cdots & j_K \end{pmatrix} \end{cases}$$

It follows from this definition that the generalized Kronecker delta is zero whenever the superscripts are not the same set of integers as the subscripts, or when the superscripts are not distinct, or when the subscripts are not distinct. Naturally, when $K = 1$ the generalized Kronecker delta reduces to the usual one. As an example, $\delta^{ij}_{kl}$ has the values

$$\delta^{12}_{12} = 1, \quad \delta^{12}_{21} = -1, \quad \delta^{13}_{12} = 0, \quad \delta^{13}_{21} = 0, \quad \delta^{11}_{12} = 0, \quad \text{etc.}$$

It can be shown that there are $N!\,K!/(N-K)!$ nonzero generalized Kronecker deltas for given positive integers $K$ and $N$.
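
The three cases of the definition can be coded directly, which also lets one count the nonzero deltas by brute force. The following is a sketch; the function name `gen_delta` and the choice $N = 3$, $K = 2$ are ours, and the parity is obtained by counting inversions rather than by explicit interchanges.

```python
from itertools import product

def gen_delta(upper, lower):
    """Generalized Kronecker delta with superscripts `upper` and subscripts `lower`."""
    # Case 1: repeated superscripts or repeated subscripts.
    if len(set(upper)) < len(upper) or len(set(lower)) < len(lower):
        return 0
    # Case 2: distinct indices, but different sets.
    if set(upper) != set(lower):
        return 0
    # Case 3: parity of the permutation taking the top row to the bottom row,
    # computed from the number of inversions.
    pos = [upper.index(j) for j in lower]
    inversions = sum(1 for a in range(len(pos))
                     for b in range(a + 1, len(pos)) if pos[a] > pos[b])
    return -1 if inversions % 2 else 1

N, K = 3, 2
indices = range(1, N + 1)
nonzero_count = sum(1 for up in product(indices, repeat=K)
                    for low in product(indices, repeat=K)
                    if gen_delta(up, low) != 0)

print(gen_delta((1, 2), (1, 2)), gen_delta((1, 2), (2, 1)), gen_delta((1, 3), (1, 2)))
print(nonzero_count)   # compare with N! K! / (N - K)!
```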


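An $\varepsilon$ symbol is one of a pair of quantities

$$\varepsilon^{i_1 i_2 \cdots i_N} = \delta^{i_1 i_2 \cdots i_N}_{1\,2\,\cdots\,N} \quad\text{or}\quad \varepsilon_{j_1 j_2 \cdots j_N} = \delta^{1\,2\,\cdots\,N}_{j_1 j_2 \cdots j_N} \tag{20.3}$$

For example, take $N = 3$; then

$$\varepsilon^{123} = \varepsilon^{312} = \varepsilon^{231} = 1$$

$$\varepsilon^{132} = \varepsilon^{321} = \varepsilon^{213} = -1$$

$$\varepsilon^{112} = \varepsilon^{221} = \varepsilon^{222} = \varepsilon^{233} = 0, \quad \text{etc.}$$

As an exercise, the reader is asked to confirm that

$$\varepsilon^{i_1 \cdots i_N}\,\varepsilon_{j_1 \cdots j_N} = \delta^{i_1 \cdots i_N}_{j_1 \cdots j_N} \tag{20.4}$$

An identity involving the quantity $\delta^{ijq}_{lms}$ is

$$\sum_{q=1}^{N} \delta^{ijq}_{lmq} = (N-2)\,\delta^{ij}_{lm} \tag{20.5}$$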
q=1 
To establish this identity, expand the left side of (20.5) in the form

$$\sum_{q=1}^{N} \delta^{ijq}_{lmq} = \delta^{ij1}_{lm1} + \delta^{ij2}_{lm2} + \cdots + \delta^{ijN}_{lmN} \tag{20.6}$$

Clearly we need to verify (20.5) for the cases $i \neq j$ and $\{i,j\} = \{l,m\}$ only, since in the remaining cases (20.5) reduces to the trivial equation $0 = 0$. In the nontrivial cases, exactly two terms on the right-hand side of (20.6) are equal to zero: one for $i = q$ and one for $j = q$. For $i$, $j$, $l$, and $m$ not equal to $q$, $\delta^{ijq}_{lmq}$ has the same value as $\delta^{ij}_{lm}$. Therefore,

$$\sum_{q=1}^{N} \delta^{ijq}_{lmq} = (N-2)\,\delta^{ij}_{lm}$$

which is the desired result (20.5). By the same procedure used above, it is clear that

$$\sum_{j=1}^{N} \delta^{ij}_{lj} = (N-1)\,\delta^{i}_{l} \tag{20.7}$$

Combining (20.5) and (20.7), we have

$$\sum_{j=1}^{N}\sum_{q=1}^{N} \delta^{ijq}_{ljq} = (N-2)(N-1)\,\delta^{i}_{l} \tag{20.8}$$

Since $\sum_{i=1}^{N} \delta^{i}_{i} = N$, we have from (20.8)

$$\sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{q=1}^{N} \delta^{ijq}_{ijq} = (N-2)(N-1)(N) = \frac{N!}{(N-3)!} \tag{20.9}$$

Equation (20.9) is a special case of

$$\sum_{i_1, i_2, \ldots, i_K = 1}^{N} \delta^{i_1 i_2 \cdots i_K}_{i_1 i_2 \cdots i_K} = \frac{N!}{(N-K)!} \tag{20.10}$$
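
Identities such as (20.5), (20.7) and (20.10) can be confirmed by exhaustive enumeration for a small value of $N$. The sketch below does this for $N = 4$. It restates a helper `gen_delta` (our own name, implementing the definition of the generalized Kronecker delta) so that the block is self-contained; a check of this kind is a sanity test, not a proof.

```python
from itertools import product
from math import factorial

def gen_delta(upper, lower):
    """Generalized Kronecker delta with superscripts `upper` and subscripts `lower`."""
    if len(set(upper)) < len(upper) or len(set(lower)) < len(lower):
        return 0
    if set(upper) != set(lower):
        return 0
    pos = [upper.index(j) for j in lower]
    inversions = sum(1 for a in range(len(pos))
                     for b in range(a + 1, len(pos)) if pos[a] > pos[b])
    return -1 if inversions % 2 else 1

N = 4
idx = range(1, N + 1)

# (20.5): contraction of the third index pair.
ok_20_5 = all(
    sum(gen_delta((i, j, q), (l, m, q)) for q in idx)
    == (N - 2) * gen_delta((i, j), (l, m))
    for i, j, l, m in product(idx, repeat=4))

# (20.7): contraction of the second index pair.
ok_20_7 = all(
    sum(gen_delta((i, j), (l, j)) for j in idx) == (N - 1) * gen_delta((i,), (l,))
    for i, l in product(idx, repeat=2))

# (20.10): full contraction for every K from 1 to N.
ok_20_10 = all(
    sum(gen_delta(t, t) for t in product(idx, repeat=K))
    == factorial(N) // factorial(N - K)
    for K in range(1, N + 1))

print(ok_20_5, ok_20_7, ok_20_10)
```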
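Several other numerical relationships are

$$\sum_{i_{R+1}, i_{R+2}, \ldots, i_K = 1}^{N} \delta^{i_1 \cdots i_R\, i_{R+1} \cdots i_K}_{j_1 \cdots j_R\, i_{R+1} \cdots i_K} = \frac{(N-R)!}{(N-K)!}\,\delta^{i_1 \cdots i_R}_{j_1 \cdots j_R} \tag{20.11}$$

$$\sum_{i_{K+1}, \ldots, i_N = 1}^{N} \varepsilon^{i_1 \cdots i_K\, i_{K+1} \cdots i_N}\,\varepsilon_{j_1 \cdots j_K\, i_{K+1} \cdots i_N} = (N-K)!\,\delta^{i_1 \cdots i_K}_{j_1 \cdots j_K} \tag{20.12}$$

$$\sum_{i_{K+1}, \ldots, i_N = 1}^{N} \varepsilon^{i_1 \cdots i_K\, i_{K+1} \cdots i_N}\,\delta^{j_{K+1} \cdots j_N}_{i_{K+1} \cdots i_N} = (N-K)!\,\varepsilon^{i_1 \cdots i_K\, j_{K+1} \cdots j_N} \tag{20.13}$$

$$\sum_{j_{K+1}, \ldots, j_R = 1}^{N} \delta^{i_1 \cdots i_K\, i_{K+1} \cdots i_R}_{j_1 \cdots j_K\, j_{K+1} \cdots j_R}\,\delta^{j_{K+1} \cdots j_R}_{l_{K+1} \cdots l_R} = (R-K)!\,\delta^{i_1 \cdots i_K\, i_{K+1} \cdots i_R}_{j_1 \cdots j_K\, l_{K+1} \cdots l_R} \tag{20.14}$$

$$\sum_{j_{K+1}, \ldots, j_R = 1}^{N}\;\sum_{i_{K+1}, \ldots, i_R = 1}^{N} \delta^{i_1 \cdots i_K\, i_{K+1} \cdots i_R}_{j_1 \cdots j_K\, j_{K+1} \cdots j_R}\,\delta^{j_{K+1} \cdots j_R}_{i_{K+1} \cdots i_R} = \frac{(N-K)!\,(R-K)!}{(N-R)!}\,\delta^{i_1 \cdots i_K}_{j_1 \cdots j_K} \tag{20.15}$$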


It is possible to simplify the formal appearance of many of our equations if we adopt a summation convention: We automatically sum every repeated index without writing the summation sign. For example, (20.5) is written

$$\delta^{ijq}_{lmq} = (N-2)\,\delta^{ij}_{lm} \tag{20.16}$$

The occurrence of the subscript $q$ and the superscript $q$ implies the summation indicated in (20.5). We shall try to arrange things so that we always sum a superscript on a subscript. It is important to know the range of a given summation, so we shall use the summation convention only when the range of summation is understood. Also, observe that the repeated indices are dummy indices in the sense that it is unimportant which symbol is used for them. For example,

$$\delta^{ijs}_{lms} = \delta^{ijt}_{lmt} = \delta^{ijq}_{lmq}$$

Naturally there is no meaning to the occurrence of the same index more than twice in a given term. Other than the summation or dummy indices, many equations have free indices whose values in the given range $\{1,\ldots,N\}$ are arbitrary. For example, the indices $i$, $j$, $l$ and $m$ are free indices in (20.16). Notice that every term in a given equation must have the same free indices; otherwise, the equation is meaningless.


The summation convention will be adopted in the remaining portion of this text. If we feel it might be confusing in some context, summations will be indicated by the usual summation sign.


Exercises


20.1 Verify equation (20.4).

20.2 Prove that $\dim\mathscr{M}^{M \times N} = MN$.
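Section 21. Determinants


In this section we shall use the generalized Kronecker deltas and the $\varepsilon$ symbols to define the determinant of a square matrix.


The determinant of the $N \times N$ matrix $A = [A_{ij}]$, written $\det A$, is a complex number defined by

$$\det A = \begin{vmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & & \vdots \\ A_{N1} & A_{N2} & \cdots & A_{NN} \end{vmatrix} = \varepsilon^{i_1 \cdots i_N} A_{i_1 1} A_{i_2 2} \cdots A_{i_N N} \tag{21.1}$$

where all summations are from 1 to $N$. If the elements of the matrix are written $[A^{ij}]$, its determinant is defined to be

$$\det A = \begin{vmatrix} A^{11} & A^{12} & \cdots & A^{1N} \\ A^{21} & A^{22} & \cdots & A^{2N} \\ \vdots & \vdots & & \vdots \\ A^{N1} & A^{N2} & \cdots & A^{NN} \end{vmatrix} = \varepsilon_{i_1 \cdots i_N} A^{i_1 1} A^{i_2 2} \cdots A^{i_N N} \tag{21.2}$$

Likewise, if the elements of the matrix are written $[A^i{}_j]$, its determinant is defined by

$$\det A = \begin{vmatrix} A^1{}_1 & A^1{}_2 & \cdots & A^1{}_N \\ A^2{}_1 & A^2{}_2 & \cdots & A^2{}_N \\ \vdots & \vdots & & \vdots \\ A^N{}_1 & A^N{}_2 & \cdots & A^N{}_N \end{vmatrix} = \varepsilon_{i_1 \cdots i_N} A^{i_1}{}_1 A^{i_2}{}_2 \cdots A^{i_N}{}_N \tag{21.3}$$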
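A similar formula holds when the matrix is written $A = [A_i{}^j]$. The generalized Kronecker delta can be written as the determinant of ordinary Kronecker deltas. For example, one can show, using (21.3), that

$$\delta^{ij}_{kl} = \begin{vmatrix} \delta^i_k & \delta^i_l \\ \delta^j_k & \delta^j_l \end{vmatrix} = \delta^i_k\,\delta^j_l - \delta^i_l\,\delta^j_k \tag{21.4}$$

Equation (21.4) is a special case of the general result

$$\delta^{i_1 \cdots i_K}_{j_1 \cdots j_K} = \begin{vmatrix} \delta^{i_1}_{j_1} & \delta^{i_1}_{j_2} & \cdots & \delta^{i_1}_{j_K} \\ \delta^{i_2}_{j_1} & \delta^{i_2}_{j_2} & \cdots & \delta^{i_2}_{j_K} \\ \vdots & \vdots & & \vdots \\ \delta^{i_K}_{j_1} & \delta^{i_K}_{j_2} & \cdots & \delta^{i_K}_{j_K} \end{vmatrix} \tag{21.5}$$

It is possible to use (21.1)-(21.3) to show that

$$\varepsilon_{j_1 \cdots j_N} \det A = \varepsilon^{i_1 \cdots i_N} A_{i_1 j_1} \cdots A_{i_N j_N} \tag{21.6}$$

$$\varepsilon^{j_1 \cdots j_N} \det A = \varepsilon_{i_1 \cdots i_N} A^{i_1 j_1} \cdots A^{i_N j_N} \tag{21.7}$$

and

$$\varepsilon_{j_1 \cdots j_N} \det A = \varepsilon_{i_1 \cdots i_N} A^{i_1}{}_{j_1} \cdots A^{i_N}{}_{j_N} \tag{21.8}$$

It is possible to use (21.6)-(21.8) and (20.10) with $N = K$ to show that

$$\det A = \frac{1}{N!}\,\varepsilon^{j_1 \cdots j_N}\,\varepsilon^{i_1 \cdots i_N} A_{i_1 j_1} \cdots A_{i_N j_N} \tag{21.9}$$

$$\det A = \frac{1}{N!}\,\varepsilon_{j_1 \cdots j_N}\,\varepsilon_{i_1 \cdots i_N} A^{i_1 j_1} \cdots A^{i_N j_N} \tag{21.10}$$

and

$$\det A = \frac{1}{N!}\,\delta^{j_1 \cdots j_N}_{i_1 \cdots i_N} A^{i_1}{}_{j_1} \cdots A^{i_N}{}_{j_N} \tag{21.11}$$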

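Equations (21.9), (21.10) and (21.11) confirm the well-known result that a matrix and its transpose have the same determinant. The reader is cautioned that the transpose mentioned here is that of the matrix and not of a linear transformation. By use of this fact it follows that (21.1)-(21.3) could have been written

$$\det A = \varepsilon^{i_1 \cdots i_N} A_{1 i_1} \cdots A_{N i_N} \tag{21.12}$$

$$\det A = \varepsilon_{i_1 \cdots i_N} A^{1 i_1} \cdots A^{N i_N} \tag{21.13}$$

and

$$\det A = \varepsilon^{i_1 \cdots i_N} A^1{}_{i_1} \cdots A^N{}_{i_N} \tag{21.14}$$

A similar logic also yields

$$\varepsilon_{j_1 \cdots j_N} \det A = \varepsilon^{i_1 \cdots i_N} A_{j_1 i_1} \cdots A_{j_N i_N} \tag{21.15}$$

$$\varepsilon^{j_1 \cdots j_N} \det A = \varepsilon_{i_1 \cdots i_N} A^{j_1 i_1} \cdots A^{j_N i_N} \tag{21.16}$$

and

$$\varepsilon^{j_1 \cdots j_N} \det A = \varepsilon^{i_1 \cdots i_N} A^{j_1}{}_{i_1} \cdots A^{j_N}{}_{i_N} \tag{21.17}$$

The cofactor of the element $A^s{}_t$ in the matrix $A = [A^i{}_j]$ is defined by

$$\operatorname{cof} A^s{}_t = \varepsilon_{i_1 i_2 \cdots i_N} A^{i_1}{}_1 \cdots A^{i_{t-1}}{}_{t-1}\,\delta^{i_t}_s\,A^{i_{t+1}}{}_{t+1} \cdots A^{i_N}{}_N \tag{21.18}$$

As an illustration of the application of (21.18), let $N = 3$ and $s = t = 1$. Then

$$\operatorname{cof} A^1{}_1 = \varepsilon_{ijk}\,\delta^i_1\,A^j{}_2\,A^k{}_3 = \varepsilon_{1jk}\,A^j{}_2\,A^k{}_3 = A^2{}_2 A^3{}_3 - A^3{}_2 A^2{}_3$$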
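
The defining sums (21.3) and (21.14) and the cofactor formula (21.18) can be evaluated literally for small matrices by summing over all permutations, since the $\varepsilon$ symbol vanishes unless its indices form a permutation. The following sketch does so; the function names and the 3 by 3 integer matrix are hypothetical, and indices are zero-based, as is usual in Python, rather than one-based as in the text.

```python
from itertools import permutations

def parity(p):
    """Sign of the permutation p of 0..n-1, from its number of inversions."""
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inv % 2 else 1

def det_by_columns(A):
    """Literal evaluation of (21.3): sum of eps_{i1..iN} A^{i1}_1 ... A^{iN}_N."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = parity(p)
        for col in range(n):
            term *= A[p[col]][col]      # row index i_col, column index col
        total += term
    return total

def det_by_rows(A):
    """Literal evaluation of (21.14): sum of eps^{i1..iN} A^1_{i1} ... A^N_{iN}."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = parity(p)
        for row in range(n):
            term *= A[row][p[row]]
        total += term
    return total

def cofactor(A, s, t):
    """Cofactor of the element in row s, column t, following (21.18):
    column t of A is replaced by the unit column with a 1 in row s."""
    n = len(A)
    B = [[(1 if i == s else 0) if j == t else A[i][j] for j in range(n)]
         for i in range(n)]
    return det_by_columns(B)

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

print(det_by_columns(A), det_by_rows(A))   # the two definitions agree
print(cofactor(A, 0, 0))                   # equals A[1][1]*A[2][2] - A[2][1]*A[1][2]
```

Summing over all $N!$ permutations is practical only for small $N$; it is used here because it mirrors the index formulas, not because it is an efficient way to compute determinants.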
Theorem 21.1.

$$\sum_{s=1}^{N} A^s{}_q \operatorname{cof} A^s{}_t = \delta^t_q \det A \quad\text{and}\quad \sum_{t=1}^{N} A^q{}_t \operatorname{cof} A^s{}_t = \delta^q_s \det A \tag{21.19}$$

Proof. It follows from (21.18) that

$$\begin{aligned} \sum_{s=1}^{N} A^s{}_q \operatorname{cof} A^s{}_t &= \varepsilon_{i_1 i_2 \cdots i_N} A^{i_1}{}_1 \cdots A^{i_{t-1}}{}_{t-1} \left(A^s{}_q\,\delta^{i_t}_s\right) A^{i_{t+1}}{}_{t+1} \cdots A^{i_N}{}_N \\ &= \varepsilon_{i_1 i_2 \cdots i_N} A^{i_1}{}_1 \cdots A^{i_{t-1}}{}_{t-1}\,A^{i_t}{}_q\,A^{i_{t+1}}{}_{t+1} \cdots A^{i_N}{}_N \\ &= \delta^t_q \det A \end{aligned}$$

where the definition (21.3) has been used. Equation (21.19)$_2$ follows by a similar argument.

Equations (21.19) represent the classical Laplace expansion of a determinant. In Chapter 0 we promised to present a proof for (0.29) for matrices of arbitrary order. Equations (21.19) fulfill this promise.


Exercises

21.1 Verify equation (21.4).

21.2 Verify equations (21.6)-(21.8).

21.3 Verify equations (21.9)-(21.11).

21.4 Show that the cofactor of an element of an $N \times N$ matrix $A$ can be written

$$\operatorname{cof} A^i{}_j = \frac{1}{(N-1)!}\,\delta^{j\,j_2 \cdots j_N}_{i\,i_2 \cdots i_N} A^{i_2}{}_{j_2} \cdots A^{i_N}{}_{j_N} \tag{21.20}$$

21.5 If $A$ and $B$ are $N \times N$ matrices use (21.8) to prove that

$$\det AB = \det A \det B$$

21.6 If $I$ is the $N \times N$ identity matrix show that $\det I = 1$.

21.7 If $A$ is a nonsingular matrix show that $\det A \neq 0$ and that

$$\det A^{-1} = \frac{1}{\det A}$$

21.8 If $A$ is an $N \times N$ matrix, we define its $K \times K$ minor ($1 \le K \le N$) to be the determinant of any $K \times K$ submatrix of $A$, e.g.,
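$$A^{i_1 \cdots i_K}_{j_1 \cdots j_K} = \det \begin{bmatrix} A^{i_1}{}_{j_1} & \cdots & A^{i_1}{}_{j_K} \\ \vdots & & \vdots \\ A^{i_K}{}_{j_1} & \cdots & A^{i_K}{}_{j_K} \end{bmatrix} = \delta^{k_1 \cdots k_K}_{j_1 \cdots j_K} A^{i_1}{}_{k_1} \cdots A^{i_K}{}_{k_K} = \delta^{i_1 \cdots i_K}_{k_1 \cdots k_K} A^{k_1}{}_{j_1} \cdots A^{k_K}{}_{j_K} \tag{21.21}$$

In particular, any element $A^i{}_j$ of $A$ is a $1 \times 1$ minor of $A$. The cofactor of the $K \times K$ minor $A^{i_1 \cdots i_K}_{j_1 \cdots j_K}$ is defined by

$$\operatorname{cof} A^{i_1 \cdots i_K}_{j_1 \cdots j_K} = \frac{1}{(N-K)!}\,\delta^{j_1 \cdots j_N}_{i_1 \cdots i_N} A^{i_{K+1}}{}_{j_{K+1}} \cdots A^{i_N}{}_{j_N} \tag{21.22}$$

which is equal to the $(N-K) \times (N-K)$ minor complementary to the $K \times K$ minor $A^{i_1 \cdots i_K}_{j_1 \cdots j_K}$ and is assigned an appropriate sign. Specifically, if $(i_{K+1},\ldots,i_N)$ and $(j_{K+1},\ldots,j_N)$ are complementary sets of $(i_1,\ldots,i_K)$ and $(j_1,\ldots,j_K)$ in $(1,\ldots,N)$, then

$$\operatorname{cof} A^{i_1 \cdots i_K}_{j_1 \cdots j_K} = \delta^{j_1 \cdots j_N}_{i_1 \cdots i_N} A^{i_{K+1} \cdots i_N}_{j_{K+1} \cdots j_N} \quad \text{(no summation)} \tag{21.23}$$

Clearly (21.22) generalizes (21.20). Show that the formulas that generalize (21.19) to $K \times K$ minors in general are

$$\begin{aligned} \delta^{i_1 \cdots i_K}_{j_1 \cdots j_K} \det A &= \frac{1}{K!} \sum_{k_1, \ldots, k_K = 1}^{N} A^{i_1 \cdots i_K}_{k_1 \cdots k_K} \operatorname{cof} A^{j_1 \cdots j_K}_{k_1 \cdots k_K} \\ \delta^{j_1 \cdots j_K}_{i_1 \cdots i_K} \det A &= \frac{1}{K!} \sum_{k_1, \ldots, k_K = 1}^{N} A^{k_1 \cdots k_K}_{i_1 \cdots i_K} \operatorname{cof} A^{k_1 \cdots k_K}_{j_1 \cdots j_K} \end{aligned} \tag{21.24}$$

21.9 If $A$ is nonsingular, show that

$$A^{i_1 \cdots i_K}_{j_1 \cdots j_K} = (\det A) \operatorname{cof} (A^{-1})^{j_1 \cdots j_K}_{i_1 \cdots i_K} \tag{21.25}$$

and that

$$\frac{1}{K!} \sum_{k_1, \ldots, k_K = 1}^{N} A^{i_1 \cdots i_K}_{k_1 \cdots k_K} (A^{-1})^{k_1 \cdots k_K}_{j_1 \cdots j_K} = \delta^{i_1 \cdots i_K}_{j_1 \cdots j_K} \tag{21.26}$$

In particular, if $K = 1$ and $A$ is replaced by $A^{-1}$ then (21.25) reduces to

$$(A^{-1})^i{}_j = (\det A)^{-1} \operatorname{cof} A^j{}_i \tag{21.27}$$

which is equation (0.30).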
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Section 22. The Matrix of a Linear Transformation 


In this section we shall introduce the matrix of a linear transformation with respect to a 
basis and investigate certain of its properties. The formulas of Chapter 0 and Section 21 are purely 
numerical formulas independent of abstract vectors or linear transformations. Here we show how a 
certain matrix can be associated with a linear transformation, and, more importantly, we show to 
what extent the relationship is basis dependent. 


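If $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{U})$, $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ is a basis for $\mathcal{V}$, and $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$ is a basis for $\mathcal{U}$, then we can characterize $\mathbf{A}$ by the system (19.1):

$$\mathbf{A}\mathbf{e}_k = A^\alpha{}_k\,\mathbf{b}_\alpha \tag{22.1}$$

where the summation is in force with the Greek indices ranging from 1 to $M$. The matrix of $\mathbf{A}$ with respect to the bases $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ and $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$, denoted by $M(\mathbf{A},\mathbf{e}_k,\mathbf{b}_\alpha)$, is

$$M(\mathbf{A},\mathbf{e}_k,\mathbf{b}_\alpha) = \begin{bmatrix} A^1{}_1 & A^1{}_2 & \cdots & A^1{}_N \\ A^2{}_1 & A^2{}_2 & \cdots & A^2{}_N \\ \vdots & \vdots & & \vdots \\ A^M{}_1 & A^M{}_2 & \cdots & A^M{}_N \end{bmatrix} \tag{22.2}$$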


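As the above argument indicates, the matrix of $\mathbf{A}$ depends upon the choice of basis for $\mathcal{V}$ and $\mathcal{U}$. However, unless this point needs to be stressed, we shall often write $M(\mathbf{A})$ for the matrix of $\mathbf{A}$ and the basis dependence is understood. We can always regard $M$ as a function $M:\mathcal{L}(\mathcal{V};\mathcal{U})\to\mathcal{M}^{M\times N}$. It is a simple exercise to confirm that

$$M(\lambda\mathbf{A}+\mu\mathbf{B}) = \lambda M(\mathbf{A}) + \mu M(\mathbf{B}) \tag{22.3}$$

for all $\lambda,\mu\in\mathcal{C}$ and $\mathbf{A},\mathbf{B}\in\mathcal{L}(\mathcal{V};\mathcal{U})$. Thus $M$ is a linear transformation. Since $M(\mathbf{A})=0$ implies $\mathbf{A}=\mathbf{0}$, $M$ is one-to-one. Since $\mathcal{L}(\mathcal{V};\mathcal{U})$ and $\mathcal{M}^{M\times N}$ have the same dimension (see Theorem 16.1 and Exercise 20.2), Theorem 15.10 tells us that $M$ is an isomorphism and, thus, $\mathcal{L}(\mathcal{V};\mathcal{U})$ and $\mathcal{M}^{M\times N}$ are isomorphic. The dependence of $M(\mathbf{A},\mathbf{e}_k,\mathbf{b}_\alpha)$ on the bases can be exhibited by use of the transformation rule (19.5). In matrix form (19.5) is

$$M(\mathbf{A},\mathbf{e}_k,\mathbf{b}_\alpha) = S\,M(\mathbf{A},\hat{\mathbf{e}}_k,\hat{\mathbf{b}}_\alpha)\,T^{-1} \tag{22.4}$$

where $S$ is the $M\times M$ matrix

$$S = \begin{bmatrix} S^1{}_1 & S^1{}_2 & \cdots & S^1{}_M \\ S^2{}_1 & S^2{}_2 & \cdots & S^2{}_M \\ \vdots & \vdots & & \vdots \\ S^M{}_1 & S^M{}_2 & \cdots & S^M{}_M \end{bmatrix} \tag{22.5}$$

and $T$ is the $N\times N$ matrix

$$T = \begin{bmatrix} T^1{}_1 & T^1{}_2 & \cdots & T^1{}_N \\ T^2{}_1 & T^2{}_2 & \cdots & T^2{}_N \\ \vdots & \vdots & & \vdots \\ T^N{}_1 & T^N{}_2 & \cdots & T^N{}_N \end{bmatrix} \tag{22.6}$$

Of course, in constructing (22.4) from (19.5) we have used the fact expressed by (14.20) and (14.21) that the matrix $T^{-1}$ has components $\hat{T}^j{}_k$, $j,k=1,\ldots,N$.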


If $\mathbf{A}$ is an endomorphism, the transformation formula (22.4) becomes

$$M(\mathbf{A},\mathbf{e}_k,\mathbf{e}_q) = T\,M(\mathbf{A},\hat{\mathbf{e}}_k,\hat{\mathbf{e}}_q)\,T^{-1} \tag{22.7}$$

We shall use (22.7) to motivate the concept of the determinant of an endomorphism. The determinant of $M(\mathbf{A},\mathbf{e}_k,\mathbf{e}_q)$, written $\det M(\mathbf{A},\mathbf{e}_k,\mathbf{e}_q)$, can be computed by (21.3). It follows from (22.7) and Exercises 21.5 and 21.7 that

$$\det M(\mathbf{A},\mathbf{e}_k,\mathbf{e}_q) = (\det T)\,(\det M(\mathbf{A},\hat{\mathbf{e}}_k,\hat{\mathbf{e}}_q))\,(\det T^{-1}) = \det M(\mathbf{A},\hat{\mathbf{e}}_k,\hat{\mathbf{e}}_q)$$

Thus, we obtain the important result that $\det M(\mathbf{A},\mathbf{e}_k,\mathbf{e}_q)$ is independent of the choice of basis for $\mathcal{V}$. With this fact we define the determinant of an endomorphism $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$, written $\det\mathbf{A}$, by

$$\det\mathbf{A} = \det M(\mathbf{A},\mathbf{e}_k,\mathbf{e}_q) \tag{22.8}$$

By the above argument, we are assured that $\det\mathbf{A}$ is a property of $\mathbf{A}$ alone. In Chapter 8, we shall introduce directly a definition for the determinant of an endomorphism without use of a basis.
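As a numerical illustration of this basis independence, the following Python sketch applies the rule (22.7) to a 2 x 2 example and compares the two determinants. The matrices `M_hat` and `T` are made-up data, and the helper names `matmul`, `det2` and `inv2` are ours, introduced only for this sketch.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(X):
    """Determinant of a 2 x 2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv2(X):
    """Inverse of a nonsingular 2 x 2 matrix via the adjugate."""
    d = det2(X)
    return [[X[1][1] / d, -X[0][1] / d],
            [-X[1][0] / d, X[0][0] / d]]

# Hypothetical data: the matrix of A in the hatted basis and a change of basis T.
M_hat = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
T = [[Fraction(1), Fraction(2)], [Fraction(1), Fraction(3)]]

# Transformation rule (22.7): M = T M_hat T^{-1}.
M = matmul(matmul(T, M_hat), inv2(T))

# The components change, but the determinant does not.
print(M, det2(M), det2(M_hat))
```

The two matrices have different entries, yet both determinants agree, which is the content of (22.8) for this particular choice of data.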


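Given a linear transformation $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{U})$, the adjoint $\mathbf{A}^*\in\mathcal{L}(\mathcal{U};\mathcal{V})$ is defined by the component formula (19.9). Consistent with equations (22.1) and (22.2), equation (19.9) implies that

$$M(\mathbf{A}^*,\mathbf{b}^\alpha,\mathbf{e}^k) = \begin{bmatrix} A^*{}_1{}^1 & A^*{}_1{}^2 & \cdots & A^*{}_1{}^M \\ A^*{}_2{}^1 & A^*{}_2{}^2 & \cdots & A^*{}_2{}^M \\ \vdots & \vdots & & \vdots \\ A^*{}_N{}^1 & A^*{}_N{}^2 & \cdots & A^*{}_N{}^M \end{bmatrix} \tag{22.9}$$

Notice that the matrix of $\mathbf{A}^*$ is referred to the reciprocal bases $\{\mathbf{e}^k\}$ and $\{\mathbf{b}^\alpha\}$. If we now use (19.10) and the definition (22.2) we see that

$$M(\mathbf{A}^*,\mathbf{b}^\alpha,\mathbf{e}^k) = \overline{M(\mathbf{A},\mathbf{e}_k,\mathbf{b}_\alpha)}^{\,T} \tag{22.10}$$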


where the complex conjugate of a matrix is the matrix formed by taking the complex conjugate of 
each component of the given matrix. Note that in (22.10) we have used the definition of the 

transpose of a matrix in Chapter 0. Equation (22.10) gives a simple comparison of the component 
matrices of a linear transformation and its adjoint. If the vector spaces are real, (22.10) reduces to 


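$$M(\mathbf{A}^*) = M(\mathbf{A})^T \tag{22.11}$$

where the basis dependence is understood. For an endomorphism $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$, we can use (22.10) and (22.8) to show that

$$\det\mathbf{A}^* = \overline{\det\mathbf{A}} \tag{22.12}$$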


Given an $M\times N$ matrix $A=[A^\alpha{}_k]$, there correspond $M$ $1\times N$ row matrices and $N$ $M\times 1$ column matrices. The row rank of the matrix $A$ is equal to the number of linearly independent row matrices and the column rank of $A$ is equal to the number of linearly independent column matrices. It is a property of matrices, which we shall prove, that the row rank equals the column rank. This common rank in turn is equal to the rank of the linear transformation whose matrix is $A$. The theorem we shall prove is the following.

Theorem 22.1. If $A=[A^\alpha{}_k]$ is an $M\times N$ matrix, the row rank of $A$ equals the column rank of $A$.

Proof. Let $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{U})$ and let $M(\mathbf{A})=A=[A^\alpha{}_k]$ with respect to bases $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ for $\mathcal{V}$ and $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$ for $\mathcal{U}$. We can define a linear transformation $\mathbf{B}:\mathcal{U}\to\mathcal{C}^M$ by

$$\mathbf{B}\mathbf{u} = (u^1,u^2,\ldots,u^M)$$

where $\mathbf{u}=u^\alpha\mathbf{b}_\alpha$. Observe that $\mathbf{B}$ is an isomorphism. The product $\mathbf{B}\mathbf{A}$ is a linear transformation $\mathcal{V}\to\mathcal{C}^M$; further,

$$\mathbf{B}\mathbf{A}\mathbf{e}_k = \mathbf{B}(A^\alpha{}_k\mathbf{b}_\alpha) = A^\alpha{}_k\mathbf{B}\mathbf{b}_\alpha = (A^1{}_k,A^2{}_k,\ldots,A^M{}_k)$$

Therefore $\mathbf{B}\mathbf{A}\mathbf{e}_k$ is an $M$-tuple whose elements are those of the $k$th column matrix of $A$. This means $\dim R(\mathbf{B}\mathbf{A})$ = column rank of $A$. Since $\mathbf{B}$ is an isomorphism, $\mathbf{B}\mathbf{A}$ and $\mathbf{A}$ have the same rank and thus column rank of $A$ = $\dim R(\mathbf{A})$. A similar argument applied to the adjoint mapping $\mathbf{A}^*$ shows that row rank of $A$ = $\dim R(\mathbf{A}^*)$. If we now apply Theorem 18.4 we find the desired result.
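Theorem 22.1 can be checked numerically on a small example. The sketch below, in Python, computes a rank by Gaussian elimination over the rationals and applies it to a matrix and to its transpose. The function names and the sample matrix are ours, chosen only for illustration; elimination is a standard computational device and is not the method used in the proof above.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, n_rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, n_rows):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == n_rows:
            break
    return r

def transpose(rows):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*rows)]

# Hypothetical 3 x 4 matrix whose second row is twice the first.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]

row_rank = rank(A)                 # number of independent rows
column_rank = rank(transpose(A))   # independent columns = independent rows of the transpose
print(row_rank, column_rank)
```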


In what follows we shall only refer to the rank of a matrix rather than to the row or the 
column rank. 


Theorem 22.2. An endomorphism $\mathbf{A}$ is regular if and only if $\det\mathbf{A}\neq 0$.

Proof. If $\det\mathbf{A}\neq 0$, equations (21.19) or, equivalently, equation (0.30), provide us with a formula for the direct calculation of $\mathbf{A}^{-1}$, so $\mathbf{A}$ is regular. If $\mathbf{A}$ is regular, then $\mathbf{A}^{-1}$ exists, and the results of Exercises 22.3 and 21.7 show that $\det\mathbf{A}\neq 0$.

Before closing this section, we mention here that among the four component matrices $[A^\alpha{}_k]$, $[A_{\alpha k}]$, $[A^{\alpha k}]$, and $[A_\alpha{}^k]$ defined by (19.2), (19.16), (19.17), and (19.18), respectively, the first one, $[A^\alpha{}_k]$, is most frequently used. For example, it is that particular component matrix of an endomorphism $\mathbf{A}$ that we used to define the determinant of $\mathbf{A}$, as shown in (22.8). In the sequel, if we refer to the component matrix of a transformation or an endomorphism, then unless otherwise specified, we mean the component matrix of the first kind.

Exercises 

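22.1 If $\mathbf{A}$ and $\mathbf{B}$ are endomorphisms, show that

$$M(\mathbf{A}\mathbf{B}) = M(\mathbf{A})M(\mathbf{B}) \quad\text{and}\quad \det\mathbf{A}\mathbf{B} = \det\mathbf{A}\det\mathbf{B}$$

22.2 If $\mathbf{A}:\mathcal{V}\to\mathcal{V}$ is a skew-Hermitian endomorphism, show that

$$\det\mathbf{A} = (-1)^N\,\overline{\det\mathbf{A}}$$

where $N=\dim\mathcal{V}$.

22.3 If $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is regular, show that

$$M(\mathbf{A}^{-1}) = M(\mathbf{A})^{-1}$$

22.4 If $\mathbf{P}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is a projection, show that there exists a basis for $\mathcal{V}$ such that

$$M(\mathbf{P}) = \begin{bmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & & 0 & & \\ & & & & \ddots & \\ & & & & & 0 \end{bmatrix}$$

where the number of ones on the diagonal is $R=\dim R(\mathbf{P})$ and all other elements are zero.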


22.5 If $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is Hermitian, show that the component version of this fact is


A, = Ay Al. =A; and ¢, A’, =e, 


where e, =e; -e,, and where {e,,...,e,} is the basis used to express the transformation 


formulas (19.16)-(19.18) in matrix notation.
22.6 If $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ we have seen it has four sets of components which are denoted by $A^i{}_j$, $A_i{}^j$, $A^{ij}$ and $A_{ij}$. Prove that $\det[A_{ij}]$ and $\det[A^{ij}]$ do depend upon the choice of basis for $\mathcal{V}$.

22.7 Show that

$$\det(\alpha\mathbf{A}) = \alpha^N\det\mathbf{A} \tag{22.13}$$

for all scalars $\alpha$. In particular, if $\mathbf{A}=\mathbf{I}$, then from Exercise 21.6

$$\det(\alpha\mathbf{I}) = \alpha^N \tag{22.14}$$

22.8 If $\mathbf{A}$ is an endomorphism of an $N$-dimensional vector space $\mathcal{V}$, show that $\det(\mathbf{A}-\lambda\mathbf{I})$ is a polynomial of degree $N$ in $\lambda$.

Section 23. Solution of Systems of Linear Equations


In this section we shall examine the problem of solving a system of $M$ equations in $N$ unknowns of the form

$$A^\alpha{}_k v^k = u^\alpha, \qquad \alpha=1,\ldots,M, \quad k=1,\ldots,N \tag{23.1}$$

where the $MN$ coefficients $A^\alpha{}_k$ and the $M$ data $u^\alpha$ are given. If we introduce bases $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ for a vector space $\mathcal{V}$ and $\{\mathbf{b}_1,\ldots,\mathbf{b}_M\}$ for a vector space $\mathcal{U}$, then (23.1) can be viewed as the component formula of a certain vector equation

$$\mathbf{A}\mathbf{v} = \mathbf{u} \tag{23.2}$$

which immediately yields the following theorem.

Theorem 23.1. (23.1) has a solution if and only if $\mathbf{u}$ is in $R(\mathbf{A})$.

Another immediate implication of (23.2) is the following theorem.

Theorem 23.2. If (23.1) has a solution, the solution is unique if and only if $\mathbf{A}$ is regular.

Given the system of equations (23.1), the associated homogeneous system is the set of equations

$$A^\alpha{}_k v^k = 0 \tag{23.3}$$

Theorem 23.3. The set of solutions of the homogeneous system (23.3) whose coefficient matrix is of rank $R$ form a vector space of dimension $N-R$.

Proof. Equation (23.3) can be regarded as the component formula of the vector equation

$$\mathbf{A}\mathbf{v} = \mathbf{0}$$

which implies that $v^k$ solves (23.3) if and only if $\mathbf{v}\in K(\mathbf{A})$. By (15.6)

$$\dim K(\mathbf{A}) = \dim\mathcal{V} - \dim R(\mathbf{A}) = N-R.$$


From Theorems 15.7 and 23.3 we see that if there are fewer equations than unknowns (i.e., $M<N$), the system (23.3) always has a nonzero solution. This assertion is clear because $N-R\geq N-M>0$, since $R(\mathbf{A})$ is a subspace of $\mathcal{U}$ and $M=\dim\mathcal{U}$.

If in (23.1) and (23.3) $M=N$, the system (23.1) has a solution for all $u^\alpha$ if and only if (23.3) has the trivial solution $v^1=v^2=\cdots=v^N=0$ only. For, in this circumstance, $\mathbf{A}$ is regular and thus invertible. This means $\det[A^\alpha{}_k]\neq 0$, and we can use (21.19) and write the solution of (23.1) in the form

$$v^j = \frac{1}{\det[A^\alpha{}_k]}\sum_{\alpha=1}^{N}(\operatorname{cof}A^\alpha{}_j)\,u^\alpha \tag{23.4}$$

which is the classical Cramer's rule and is the generalization to $N$ dimensions of equation (0.32).
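A short Python sketch may help fix the index pattern of Cramer's rule: the component $v^j$ is obtained by summing the data against the cofactors of the $j$th column. The helper names (`minor`, `det`, `cof`, `cramer`) and the sample system are ours and are only illustrative; cofactor expansion is used for transparency, not for efficiency.

```python
from fractions import Fraction

def minor(A, i, j):
    """Matrix A with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 0:
        return 1  # determinant of the empty matrix, needed for 1 x 1 cofactors
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def cof(A, k, j):
    """Cofactor of the element in row k, column j."""
    return (-1) ** (k + j) * det(minor(A, k, j))

def cramer(A, u):
    """Solve A v = u for square A: v^j = sum_k cof(A^k_j) u^k / det A."""
    d = det(A)
    if d == 0:
        raise ValueError("coefficient matrix is singular")
    n = len(A)
    return [sum(cof(A, k, j) * u[k] for k in range(n)) / d for j in range(n)]

# Hypothetical system with N = 3.
A = [[Fraction(2), Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(0), Fraction(1), Fraction(2)]]
u = [Fraction(3), Fraction(5), Fraction(3)]

v = cramer(A, u)
print(v)
```

Substituting the computed `v` back into the three equations reproduces the data `u`, as it should when the coefficient determinant is nonzero.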


Exercise 
23.1 Given the system of equations (23.1), the augmented matrix of the system is the matrix obtained from $[A^\alpha{}_k]$ by the addition of a column formed from $u^1,\ldots,u^M$. Use Theorem 23.1 and prove that the system (23.1) has a solution if and only if the rank of $[A^\alpha{}_k]$ equals the rank of the augmented matrix.


Chapter 6 


SPECTRAL DECOMPOSITIONS 


In this chapter we consider one of the more advanced topics in the study of linear 
transformations. Essentially we shall consider the problem of analyzing an endomorphism by 
decomposing it into elementary parts. 


Section 24. Direct Sum of Endomorphisms 


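If $\mathbf{A}$ is an endomorphism of a vector space $\mathcal{V}$, a subspace $\mathcal{V}_1$ of $\mathcal{V}$ is said to be $\mathbf{A}$-invariant if $\mathbf{A}$ maps $\mathcal{V}_1$ to $\mathcal{V}_1$. The most obvious example of an $\mathbf{A}$-invariant subspace is the null space $K(\mathbf{A})$. Let $\mathbf{A}_1,\mathbf{A}_2,\ldots,\mathbf{A}_L$ be endomorphisms of $\mathcal{V}$; then an endomorphism $\mathbf{A}$ is the direct sum of $\mathbf{A}_1,\mathbf{A}_2,\ldots,\mathbf{A}_L$ if

$$\mathbf{A} = \mathbf{A}_1+\mathbf{A}_2+\cdots+\mathbf{A}_L \tag{24.1}$$

and

$$\mathbf{A}_i\mathbf{A}_j = \mathbf{0}, \qquad i\neq j \tag{24.2}$$

For example, equation (17.18) presents a direct sum decomposition for the identity linear transformation.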


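Theorem 24.1. If $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ and $\mathcal{V}=\mathcal{V}_1\oplus\mathcal{V}_2\oplus\cdots\oplus\mathcal{V}_L$, where each subspace $\mathcal{V}_j$ is $\mathbf{A}$-invariant, then $\mathbf{A}$ has the direct sum decomposition (24.1), where each $\mathbf{A}_i$ is given by

$$\mathbf{A}_i\mathbf{v}_i = \mathbf{A}\mathbf{v}_i \tag{24.3$_1$}$$

for all $\mathbf{v}_i\in\mathcal{V}_i$ and

$$\mathbf{A}_i\mathbf{v}_j = \mathbf{0} \tag{24.3$_2$}$$

for all $\mathbf{v}_j\in\mathcal{V}_j$, $i\neq j$, for $i=1,\ldots,L$. Thus the restriction of $\mathbf{A}$ to $\mathcal{V}_j$ coincides with that of $\mathbf{A}_j$; further, each $\mathcal{V}_i$ is $\mathbf{A}_j$-invariant, for all $j=1,\ldots,L$.

Proof. Given the decomposition $\mathcal{V}=\mathcal{V}_1\oplus\mathcal{V}_2\oplus\cdots\oplus\mathcal{V}_L$, then $\mathbf{v}\in\mathcal{V}$ has the unique representation $\mathbf{v}=\mathbf{v}_1+\cdots+\mathbf{v}_L$, where each $\mathbf{v}_i\in\mathcal{V}_i$. By the converse of Theorem 17.4, there exist $L$ projections $\mathbf{P}_1,\ldots,\mathbf{P}_L$ (19.18) and also $\mathbf{v}_j\in\mathbf{P}_j(\mathcal{V})$. From (24.3), we have $\mathbf{A}_j=\mathbf{A}\mathbf{P}_j$, and since $\mathcal{V}_j$ is $\mathbf{A}$-invariant, $\mathbf{A}\mathbf{P}_j=\mathbf{P}_j\mathbf{A}$. Therefore, if $i\neq j$,

$$\mathbf{A}_i\mathbf{A}_j = \mathbf{A}\mathbf{P}_i\mathbf{A}\mathbf{P}_j = \mathbf{A}\mathbf{A}\mathbf{P}_i\mathbf{P}_j = \mathbf{0}$$

where (17.17)$_2$ has been used. Also

$$\mathbf{A}_1+\mathbf{A}_2+\cdots+\mathbf{A}_L = \mathbf{A}\mathbf{P}_1+\mathbf{A}\mathbf{P}_2+\cdots+\mathbf{A}\mathbf{P}_L = \mathbf{A}(\mathbf{P}_1+\mathbf{P}_2+\cdots+\mathbf{P}_L) = \mathbf{A}$$

where (17.18) has been used.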
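The construction $\mathbf{A}_j=\mathbf{A}\mathbf{P}_j$ used in the proof can be tried out on a concrete case. The Python sketch below assumes a three-dimensional space split into the span of the first two basis vectors and the span of the third; the matrices `A`, `P1`, `P2` and the helper names are hypothetical choices of ours, not taken from the text.

```python
def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def is_zero(X):
    """True when every entry of X vanishes."""
    return all(x == 0 for row in X for x in row)

# Hypothetical endomorphism for which both subspaces are invariant.
A = [[1, 2, 0],
     [3, 4, 0],
     [0, 0, 5]]
P1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projection onto span{e1, e2}
P2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]   # projection onto span{e3}

A1 = matmul(A, P1)   # A_1 = A P_1, as in the proof
A2 = matmul(A, P2)   # A_2 = A P_2

print(matadd(A1, A2) == A)            # decomposition (24.1)
print(is_zero(matmul(A1, A2)))        # condition (24.2)
print(matmul(A, P1) == matmul(P1, A)) # invariance expressed as A P_1 = P_1 A
```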


When the assumptions of the preceding theorem are satisfied, the endomorphism $\mathbf{A}$ is said to be reduced by the subspaces $\mathcal{V}_1,\ldots,\mathcal{V}_L$. An important result of this circumstance is contained in the following theorem.

Theorem 24.2. Under the conditions of Theorem 24.1, the determinant of the endomorphism $\mathbf{A}$ is given by

$$\det\mathbf{A} = \det\mathbf{A}_1\det\mathbf{A}_2\cdots\det\mathbf{A}_L \tag{24.4}$$

where $\mathbf{A}_k$ denotes the restriction of $\mathbf{A}$ to $\mathcal{V}_k$ for all $k=1,\ldots,L$.


The proof of this theorem is left as an exercise to the reader. 


Exercises 


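24.1 Assume that $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is reduced by $\mathcal{V}_1,\ldots,\mathcal{V}_L$ and select for the basis of $\mathcal{V}$ the union of some basis for the subspaces $\mathcal{V}_1,\ldots,\mathcal{V}_L$. Show that with respect to this basis $M(\mathbf{A})$ has the following block form:

$$M(\mathbf{A}) = \begin{bmatrix} A_1 & & & 0 \\ & A_2 & & \\ & & \ddots & \\ 0 & & & A_L \end{bmatrix} \tag{24.5}$$

where $A_j$ is the matrix of the restriction of $\mathbf{A}$ to $\mathcal{V}_j$.

24.2 Use the result of Exercise 24.1 and prove Theorem 24.2.

24.3 Show that if $\mathcal{V}=\mathcal{V}_1\oplus\mathcal{V}_2\oplus\cdots\oplus\mathcal{V}_L$, then $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is reduced by $\mathcal{V}_1,\ldots,\mathcal{V}_L$ if and only if $\mathbf{A}\mathbf{P}_j=\mathbf{P}_j\mathbf{A}$ for $j=1,\ldots,L$, where $\mathbf{P}_j$ is the projection on $\mathcal{V}_j$ given by (17.21).

Section 25. Eigenvectors and Eigenvalues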


Given an endomorphism $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$, the problem of finding a direct sum decomposition of $\mathbf{A}$ is closely related to the study of the spectral properties of $\mathbf{A}$. This concept is central in the discussion of eigenvalue problems.

A scalar $\lambda$ is an eigenvalue of $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ if there exists a nonzero vector $\mathbf{v}\in\mathcal{V}$ such that

$$\mathbf{A}\mathbf{v} = \lambda\mathbf{v} \tag{25.1}$$

The vector $\mathbf{v}$ in (25.1) is called an eigenvector of $\mathbf{A}$ corresponding to the eigenvalue $\lambda$. Eigenvalues and eigenvectors are sometimes referred to as latent roots and latent vectors, characteristic roots and characteristic vectors, and proper values and proper vectors. The set of all eigenvalues of $\mathbf{A}$ is the spectrum of $\mathbf{A}$, denoted by $\sigma(\mathbf{A})$. For any $\lambda\in\sigma(\mathbf{A})$, the set

$$\mathcal{V}(\lambda) = \{\mathbf{v}\in\mathcal{V}\mid\mathbf{A}\mathbf{v}=\lambda\mathbf{v}\}$$

is a subspace of $\mathcal{V}$, called the eigenspace or characteristic subspace corresponding to $\lambda$. The geometric multiplicity of $\lambda$ is the dimension of $\mathcal{V}(\lambda)$.


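Theorem 25.1. Given any $\lambda\in\sigma(\mathbf{A})$, the corresponding eigenspace $\mathcal{V}(\lambda)$ is $\mathbf{A}$-invariant.

The proof of this theorem involves an elementary use of (25.1). Given any $\lambda\in\sigma(\mathbf{A})$, the restriction of $\mathbf{A}$ to $\mathcal{V}(\lambda)$, denoted by $\mathbf{A}_{\mathcal{V}(\lambda)}$, has the property that

$$\mathbf{A}_{\mathcal{V}(\lambda)}\mathbf{u} = \lambda\mathbf{u} \tag{25.2}$$

for all $\mathbf{u}$ in $\mathcal{V}(\lambda)$. Geometrically, $\mathbf{A}_{\mathcal{V}(\lambda)}$ simply amplifies $\mathbf{u}$ by the factor $\lambda$. Such linear transformations are often called dilatations.

Theorem 25.2. $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is not regular if and only if 0 is an eigenvalue of $\mathbf{A}$; further, the corresponding eigenspace is $K(\mathbf{A})$.

The proof is obvious. Note that (25.1) can be written as $(\mathbf{A}-\lambda\mathbf{I})\mathbf{v}=\mathbf{0}$, which shows that

$$\mathcal{V}(\lambda) = K(\mathbf{A}-\lambda\mathbf{I}) \tag{25.3}$$

Equation (25.3) implies that $\lambda$ is an eigenvalue of $\mathbf{A}$ if and only if $\mathbf{A}-\lambda\mathbf{I}$ is singular. Therefore, by Theorem 22.2,

$$\lambda\in\sigma(\mathbf{A}) \iff \det(\mathbf{A}-\lambda\mathbf{I}) = 0 \tag{25.4}$$

The polynomial $f(\lambda)$ of degree $N=\dim\mathcal{V}$ defined by

$$f(\lambda) = \det(\mathbf{A}-\lambda\mathbf{I}) \tag{25.5}$$

is called the characteristic polynomial of $\mathbf{A}$. Equations (25.4) and (25.5) show that the eigenvalues of $\mathbf{A}$ are roots of the characteristic equation

$$f(\lambda) = 0 \tag{25.6}$$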
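As a worked illustration of (25.3)-(25.6), consider a hypothetical endomorphism of a two-dimensional real space whose matrix relative to some basis has the entries shown below. The matrix is our own choice and serves only to make the definitions concrete.

```latex
M(\mathbf{A}) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
f(\lambda) = \det\begin{bmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{bmatrix}
           = (2-\lambda)^2 - 1 = (1-\lambda)(3-\lambda).

% The roots of f(\lambda) = 0 give the spectrum:
\sigma(\mathbf{A}) = \{1, 3\}.

% The eigenspaces follow from (25.3), in components relative to the chosen basis:
\mathcal{V}(1) = K(\mathbf{A}-\mathbf{I}) = \operatorname{span}\{(1,-1)\}, \qquad
\mathcal{V}(3) = K(\mathbf{A}-3\mathbf{I}) = \operatorname{span}\{(1,1)\}.
```

Each eigenvalue here has geometric multiplicity one, and neither is zero, so by Theorem 25.2 this endomorphism is regular.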


Exercises 


25.1 Let $\mathbf{A}$ be an endomorphism whose matrix $M(\mathbf{A})$ relative to a particular basis is a triangular matrix, say

$$M(\mathbf{A}) = \begin{bmatrix} A^1{}_1 & A^1{}_2 & \cdots & A^1{}_N \\ & A^2{}_2 & \cdots & A^2{}_N \\ & & \ddots & \vdots \\ 0 & & & A^N{}_N \end{bmatrix}$$

What is the characteristic polynomial of $\mathbf{A}$? Use (25.6) and determine the eigenvalues of $\mathbf{A}$.

25.2 Show that the eigenvalues of a Hermitian endomorphism of an inner product space are all real numbers and the eigenvalues of a skew-Hermitian endomorphism are all purely imaginary numbers (including the number $0=i0$).

25.3 Show that the eigenvalues of a unitary endomorphism all have absolute value 1 and the eigenvalues of a projection are either 1 or 0.

25.4 Show that the eigenspaces corresponding to different eigenvalues of a Hermitian endomorphism are mutually orthogonal.

25.5 If $\mathbf{B}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is regular, show that $\mathbf{B}^{-1}\mathbf{A}\mathbf{B}$ and $\mathbf{A}$ have the same characteristic equation. How are the eigenvectors of $\mathbf{B}^{-1}\mathbf{A}\mathbf{B}$ related to those of $\mathbf{A}$?

Section 26. The Characteristic Polynomial


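In the last section we found that the eigenvalues of $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ are roots of the characteristic equation

$$f(\lambda) = 0 \tag{26.1}$$

where

$$f(\lambda) = \det(\mathbf{A}-\lambda\mathbf{I}) \tag{26.2}$$

If $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ is a basis for $\mathcal{V}$, then by (19.6)

$$\mathbf{A}\mathbf{e}_k = A^j{}_k\,\mathbf{e}_j \tag{26.3}$$

Therefore, by (26.2) and (21.11),

$$f(\lambda) = \frac{1}{N!}\,\delta^{j_1\cdots j_N}_{i_1\cdots i_N}\,(A^{i_1}{}_{j_1}-\lambda\delta^{i_1}_{j_1})\cdots(A^{i_N}{}_{j_N}-\lambda\delta^{i_N}_{j_N}) \tag{26.4}$$

If (26.4) is expanded and we use (20.11), the result is

$$f(\lambda) = (-\lambda)^N+\mu_1(-\lambda)^{N-1}+\cdots+\mu_{N-1}(-\lambda)+\mu_N \tag{26.5}$$

where

$$\mu_j = \frac{1}{j!}\,\delta^{q_1\cdots q_j}_{i_1\cdots i_j}\,A^{i_1}{}_{q_1}\cdots A^{i_j}{}_{q_j} \tag{26.6}$$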


Since $f(\lambda)$ is defined by (26.2), the coefficients $\mu_j$, $j=1,\ldots,N$, are independent of the choice of basis for $\mathcal{V}$. These coefficients are called the fundamental invariants of $\mathbf{A}$. Equation (26.6) specializes to

$$\mu_1 = \operatorname{tr}\mathbf{A}, \qquad \mu_2 = \tfrac{1}{2}\{(\operatorname{tr}\mathbf{A})^2-\operatorname{tr}\mathbf{A}^2\}, \qquad\text{and}\qquad \mu_N = \det\mathbf{A} \tag{26.7}$$

where (21.4) has been used to obtain (26.7). Since $f(\lambda)$ is an $N$th degree polynomial, it can be factored into the form

$$f(\lambda) = (\lambda_1-\lambda)^{d_1}(\lambda_2-\lambda)^{d_2}\cdots(\lambda_L-\lambda)^{d_L} \tag{26.8}$$

where $\lambda_1,\ldots,\lambda_L$ are the distinct roots of $f(\lambda)=0$ and $d_1,\ldots,d_L$ are positive integers which must satisfy $\sum_{j=1}^{L}d_j=N$. It is in writing (26.8) that we have made use of the assumption that the scalar field is complex. If the scalar field is real, the polynomial (26.5), generally, cannot be factored.
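The expansion (26.5) with the invariants (26.7) can be compared against a direct evaluation of (26.2). The Python sketch below does this for a hypothetical 3 x 3 matrix; the function names and the matrix are ours, and for $N=3$ the three formulas in (26.7) supply all of the invariants.

```python
from fractions import Fraction

def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def trace(X):
    """Sum of the diagonal elements."""
    return sum(X[i][i] for i in range(len(X)))

def minor(X, i, j):
    """Matrix X with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(X) if k != i]

def det(X):
    """Determinant by cofactor expansion along the first row."""
    if len(X) == 0:
        return 1
    return sum((-1) ** j * X[0][j] * det(minor(X, 0, j)) for j in range(len(X)))

def invariants3(A):
    """mu_1, mu_2, mu_3 of a 3 x 3 matrix from (26.7)."""
    t1 = trace(A)
    t2 = trace(matmul(A, A))
    return t1, Fraction(t1 * t1 - t2, 2), det(A)

def char_poly(A, lam):
    """f(lam) = det(A - lam I), evaluated directly from (26.2)."""
    n = len(A)
    return det([[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)])

# Hypothetical matrix of an endomorphism of a 3-dimensional space.
A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]
mu1, mu2, mu3 = invariants3(A)

def expanded(lam):
    """Right-hand side of (26.5) for N = 3."""
    return (-lam) ** 3 + mu1 * (-lam) ** 2 + mu2 * (-lam) + mu3

print(mu1, mu2, mu3)
print(all(char_poly(A, t) == expanded(t) for t in range(-3, 6)))
```

For this matrix the characteristic polynomial factors as $(1-\lambda)(2-\lambda)(4-\lambda)$, so the form (26.8) holds with three distinct real roots, each of multiplicity one.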


In general, a scalar field is said to be algebraically closed if every polynomial equation has at least one root in the field, or equivalently, if every polynomial, such as $f(\lambda)$, can be factored into the form (26.8). It is a theorem in algebra that the complex field is algebraically closed. The real field, however, is not algebraically closed. For example, if $\lambda$ is real the polynomial equation $f(\lambda)=\lambda^2+1=0$ has no real roots. By allowing the scalar fields to be complex, we are assured that every endomorphism has at least one eigenvector. In the expression (26.8), the integer $d_j$ is called the algebraic multiplicity of the eigenvalue $\lambda_j$. It is possible to prove that the algebraic multiplicity of an eigenvalue is not less than the geometric multiplicity of the same eigenvalue. However, we shall postpone the proof of this result until Section 30.

An expression for the invariant $\mu_j$ can be obtained in terms of the eigenvalues if (26.8) is expanded and the results are compared to (26.5). For example,

$$\mu_1 = \operatorname{tr}\mathbf{A} = d_1\lambda_1+d_2\lambda_2+\cdots+d_L\lambda_L$$
$$\mu_N = \det\mathbf{A} = \lambda_1^{d_1}\lambda_2^{d_2}\cdots\lambda_L^{d_L} \tag{26.9}$$


The next theorem we want to prove is known as the Cayley-Hamilton theorem. Roughly 
speaking, this theorem asserts that an endomorphism satisfies its own characteristic equation. To 
make this statement clear, we need to introduce certain ideas associated with polynomials of 
endomorphisms. As we have done in several places in the preceding chapters, if $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$, $\mathbf{A}^2$ is defined by

$$\mathbf{A}^2 = \mathbf{A}\mathbf{A}$$

Similarly, we define by induction, starting from

$$\mathbf{A}^0 = \mathbf{I}$$

and in general

$$\mathbf{A}^k = \mathbf{A}\mathbf{A}^{k-1} = \mathbf{A}^{k-1}\mathbf{A}$$

where $k$ is any integer greater than one. If $k$ and $l$ are positive integers, it is easily established by induction that

$$\mathbf{A}^k\mathbf{A}^l = \mathbf{A}^l\mathbf{A}^k = \mathbf{A}^{l+k} \tag{26.10}$$

Thus $\mathbf{A}^k$ and $\mathbf{A}^l$ commute. A polynomial in $\mathbf{A}$ is an endomorphism of the form

$$g(\mathbf{A}) = \alpha_0\mathbf{A}^M+\alpha_1\mathbf{A}^{M-1}+\cdots+\alpha_{M-1}\mathbf{A}+\alpha_M\mathbf{I} \tag{26.11}$$

where $M$ is a positive integer and $\alpha_0,\ldots,\alpha_M$ are scalars. Such polynomials have certain of the properties of polynomials of scalars. For example, if a scalar polynomial

$$g(t) = \alpha_0t^M+\alpha_1t^{M-1}+\cdots+\alpha_{M-1}t+\alpha_M$$

can be factored into

$$g(t) = \alpha_0(t-\eta_1)(t-\eta_2)\cdots(t-\eta_M)$$

then the polynomial (26.11) can be factored into

$$g(\mathbf{A}) = \alpha_0(\mathbf{A}-\eta_1\mathbf{I})(\mathbf{A}-\eta_2\mathbf{I})\cdots(\mathbf{A}-\eta_M\mathbf{I}) \tag{26.12}$$

The order of the factors in (26.12) is not important since, as a result of (26.10), the factors commute. Notice, however, that the product of two nonzero endomorphisms can be zero. Thus the formula

$$g_1(\mathbf{A})g_2(\mathbf{A}) = \mathbf{0} \tag{26.13}$$

for two polynomials $g_1$ and $g_2$ generally does not imply one of the factors is zero. For example, any projection $\mathbf{P}$ satisfies the equation $\mathbf{P}(\mathbf{P}-\mathbf{I})=\mathbf{0}$, but generally $\mathbf{P}$ and $\mathbf{P}-\mathbf{I}$ are both nonzero.


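Theorem 26.1. (Cayley-Hamilton). If $f(\lambda)$ is the characteristic polynomial (26.5) for an endomorphism $\mathbf{A}$, then

$$f(\mathbf{A}) = (-\mathbf{A})^N+\mu_1(-\mathbf{A})^{N-1}+\cdots+\mu_{N-1}(-\mathbf{A})+\mu_N\mathbf{I} = \mathbf{0} \tag{26.14}$$

Proof. The proof which we shall now present makes major use of equation (0.29) or, equivalently, (21.19). If $\operatorname{adj}(\mathbf{A}-\lambda\mathbf{I})$ is the endomorphism whose matrix is $\operatorname{adj}[A^j{}_k-\lambda\delta^j_k]$, where $[A^j{}_k]=M(\mathbf{A})$, then by (26.2) and (0.29) (see also Exercise 40.5)

$$(\operatorname{adj}(\mathbf{A}-\lambda\mathbf{I}))(\mathbf{A}-\lambda\mathbf{I}) = f(\lambda)\mathbf{I} \tag{26.15}$$

By (21.18) it follows that $\operatorname{adj}(\mathbf{A}-\lambda\mathbf{I})$ is a polynomial of degree $N-1$ in $\lambda$. Therefore

$$\operatorname{adj}(\mathbf{A}-\lambda\mathbf{I}) = \mathbf{B}_0(-\lambda)^{N-1}+\mathbf{B}_1(-\lambda)^{N-2}+\cdots+\mathbf{B}_{N-2}(-\lambda)+\mathbf{B}_{N-1} \tag{26.16}$$

where $\mathbf{B}_0,\ldots,\mathbf{B}_{N-1}$ are endomorphisms determined by $\mathbf{A}$. If we now substitute (26.16) and (26.5) into (26.15) and require the result to hold for all $\lambda$, we find

$$\begin{aligned} \mathbf{B}_0 &= \mathbf{I} \\ \mathbf{B}_0\mathbf{A}+\mathbf{B}_1 &= \mu_1\mathbf{I} \\ \mathbf{B}_1\mathbf{A}+\mathbf{B}_2 &= \mu_2\mathbf{I} \\ &\;\;\vdots \\ \mathbf{B}_{N-2}\mathbf{A}+\mathbf{B}_{N-1} &= \mu_{N-1}\mathbf{I} \\ \mathbf{B}_{N-1}\mathbf{A} &= \mu_N\mathbf{I} \end{aligned} \tag{26.17}$$

Now we multiply (26.17)$_1$ by $(-\mathbf{A})^N$, (26.17)$_2$ by $(-\mathbf{A})^{N-1}$, (26.17)$_3$ by $(-\mathbf{A})^{N-2}$, ..., (26.17)$_k$ by $(-\mathbf{A})^{N-k+1}$, etc., and add the resulting $N+1$ equations. The left-hand sides cancel in pairs, and we find

$$(-\mathbf{A})^N+\mu_1(-\mathbf{A})^{N-1}+\cdots+\mu_{N-1}(-\mathbf{A})+\mu_N\mathbf{I} = \mathbf{0} \tag{26.18}$$

which is the desired result.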
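The steps of this proof can be followed numerically. The Python sketch below builds the endomorphisms $\mathbf{B}_k$ from the recursion implied by (26.17) and then evaluates the left-hand side of (26.14) for a hypothetical 3 x 3 matrix. The helper names and the matrix are ours; the invariants are entered by hand from (26.7) for this particular matrix.

```python
def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in X]

def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical matrix; tr A = 7, ((tr A)^2 - tr A^2)/2 = 14, det A = 8.
A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]
N = len(A)
mu = [7, 14, 8]   # mu_1, mu_2, mu_3 from (26.7)

# Recursion read off from (26.17): B_0 = I and B_k = mu_k I - B_{k-1} A.
I = identity(N)
B = [I]
for k in range(1, N):
    B.append(matadd(scale(mu[k - 1], I), scale(-1, matmul(B[k - 1], A))))

# Horner-style evaluation of (26.14): start from I, multiply by (-A), add mu_k I.
minus_A = scale(-1, A)
f_of_A = I
for m in mu:
    f_of_A = matadd(matmul(f_of_A, minus_A), scale(m, I))

print(B[N - 1])              # the last B, which multiplies A to give mu_N I
print(matmul(B[N - 1], A))   # last equation of (26.17)
print(f_of_A)                # the zero matrix, as (26.14) asserts
```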


Exercises 


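26.1 If $N=\dim\mathcal{V}$ is odd, and the scalar field is $\mathcal{R}$, prove that the characteristic equation of any $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ has at least one eigenvalue.

26.2 If $N=\dim\mathcal{V}$ is odd and the scalar field is $\mathcal{R}$, prove that if $\det\mathbf{A}>0$ $(<0)$, then $\mathbf{A}$ has at least one positive (negative) eigenvalue.

26.3 If $N=\dim\mathcal{V}$ is even and the scalar field is $\mathcal{R}$ and $\det\mathbf{A}<0$, prove that $\mathbf{A}$ has at least one positive and one negative eigenvalue.

26.4 Let $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ be regular. Show that

$$\det(\mathbf{A}^{-1}-\lambda^{-1}\mathbf{I}) = (-\lambda)^{-N}\det\mathbf{A}^{-1}\det(\mathbf{A}-\lambda\mathbf{I})$$

where $N=\dim\mathcal{V}$.

26.5 Prove that the characteristic polynomial of a projection $\mathbf{P}:\mathcal{V}\to\mathcal{V}$ is

$$f(\lambda) = (-1)^{N-L}\lambda^{N-L}(1-\lambda)^L$$

where $N=\dim\mathcal{V}$ and $L=\dim\mathbf{P}(\mathcal{V})$.

26.6 If $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is regular, express $\mathbf{A}^{-1}$ as a polynomial in $\mathbf{A}$.

26.7 Prove directly that the quantity $\mu_j$ defined by (26.6) is independent of the choice of basis for $\mathcal{V}$.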


26.8 Let $\mathbf{C}$ be an endomorphism whose matrix relative to a basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ is a triangular matrix of the form

$$M(\mathbf{C}) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ & & & 0 & 1 \\ 0 & \cdots & & 0 & 0 \end{bmatrix} \tag{26.19}$$

i.e., $\mathbf{C}$ maps $\mathbf{e}_1$ to $\mathbf{0}$, $\mathbf{e}_2$ to $\mathbf{e}_1$, $\mathbf{e}_3$ to $\mathbf{e}_2$, .... Show that a necessary and sufficient condition for the existence of a component matrix of the form (26.19) for an endomorphism $\mathbf{C}$ is that

$$\mathbf{C}^N = \mathbf{0} \quad\text{but}\quad \mathbf{C}^{N-1}\neq\mathbf{0} \tag{26.20}$$

We call such an endomorphism $\mathbf{C}$ a nilcyclic endomorphism and we call the basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ for the form (26.19) a cyclic basis for $\mathbf{C}$.

26.9 If $\phi_1,\ldots,\phi_N$ denote the fundamental invariants of $\mathbf{A}^{-1}$, use the results of Exercise 26.4 to show that

$$\phi_j = (\det\mathbf{A})^{-1}\mu_{N-j}$$

for $j=1,\ldots,N$.

26.10 It follows from (26.16) that

$$\mathbf{B}_{N-1} = \operatorname{adj}\mathbf{A}$$

Use this result along with (26.17) and show that

$$\operatorname{adj}\mathbf{A} = (-\mathbf{A})^{N-1}+\mu_1(-\mathbf{A})^{N-2}+\cdots+\mu_{N-1}\mathbf{I}$$

26.11 Show that

$$\mu_{N-1} = \operatorname{tr}(\operatorname{adj}\mathbf{A})$$

and, from Exercise 26.10, that

$$\mu_{N-1} = -\frac{1}{N-1}\left\{\operatorname{tr}(-\mathbf{A})^{N-1}+\mu_1\operatorname{tr}(-\mathbf{A})^{N-2}+\cdots+\mu_{N-2}\operatorname{tr}(-\mathbf{A})\right\}$$

Section 27. Spectral Decomposition for Hermitian Endomorphisms


An important problem in mathematical physics is to find a basis for a vector space $\mathcal{V}$ in which the matrix of a given $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is diagonal. If we examine (25.1), we see that if $\mathcal{V}$ has a basis of eigenvectors of $\mathbf{A}$, then $M(\mathbf{A})$ is diagonal and vice versa; further, in this case the diagonal elements of $M(\mathbf{A})$ are the eigenvalues of $\mathbf{A}$. As we shall see, not all endomorphisms have matrices which are diagonal. Rather than consider this problem in general, we specialize here to the case where $\mathbf{A}$ is Hermitian and show that every Hermitian endomorphism has a matrix which takes on the diagonal form. The general case will be treated later, in Section 30. First, we prove a general theorem for arbitrary endomorphisms about the linear independence of eigenvectors.

Theorem 27.1. If $\lambda_1,\ldots,\lambda_L$ are distinct eigenvalues of $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ and if $\mathbf{u}_1,\ldots,\mathbf{u}_L$ are eigenvectors corresponding to them, then $\{\mathbf{u}_1,\ldots,\mathbf{u}_L\}$ form a linearly independent set.


Proof. If $\{\mathbf{u}_1,\ldots,\mathbf{u}_L\}$ is not linearly independent, we choose a maximal, linearly independent subset, say $\{\mathbf{u}_1,\ldots,\mathbf{u}_S\}$, from the set $\{\mathbf{u}_1,\ldots,\mathbf{u}_L\}$; then the remaining vectors can be expressed uniquely as linear combinations of $\{\mathbf{u}_1,\ldots,\mathbf{u}_S\}$, say

$$\mathbf{u}_{S+1} = \alpha_1\mathbf{u}_1+\cdots+\alpha_S\mathbf{u}_S \tag{27.1}$$

where $\alpha_1,\ldots,\alpha_S$ are not all zero and unique, because $\{\mathbf{u}_1,\ldots,\mathbf{u}_S\}$ is linearly independent. Applying $\mathbf{A}$ to (27.1) yields

$$\lambda_{S+1}\mathbf{u}_{S+1} = (\alpha_1\lambda_1)\mathbf{u}_1+\cdots+(\alpha_S\lambda_S)\mathbf{u}_S \tag{27.2}$$

Now if $\lambda_{S+1}=0$, then $\lambda_1,\ldots,\lambda_S$ are nonzero because the eigenvalues are distinct and (27.2) contradicts the linear independence of $\{\mathbf{u}_1,\ldots,\mathbf{u}_S\}$; on the other hand, if $\lambda_{S+1}\neq 0$, then we can divide (27.2) by $\lambda_{S+1}$, obtaining another expression of $\mathbf{u}_{S+1}$ as a linear combination of $\{\mathbf{u}_1,\ldots,\mathbf{u}_S\}$, contradicting the uniqueness of the coefficients $\alpha_1,\ldots,\alpha_S$. Hence in any case the maximal linearly independent subset cannot be a proper subset of $\{\mathbf{u}_1,\ldots,\mathbf{u}_L\}$; thus $\{\mathbf{u}_1,\ldots,\mathbf{u}_L\}$ is linearly independent.

As a corollary to the preceding theorem, we see that if the geometric multiplicity is equal to the algebraic multiplicity for each eigenvalue of $\mathbf{A}$, then the vector space $\mathcal{V}$ admits the direct sum representation

$$\mathcal{V} = \mathcal{V}(\lambda_1)\oplus\mathcal{V}(\lambda_2)\oplus\cdots\oplus\mathcal{V}(\lambda_L)$$

where $\lambda_1,\ldots,\lambda_L$ are the distinct eigenvalues of $\mathbf{A}$. The reason for this representation is obvious, since the right-hand side of the above equation is a subspace having the same dimension as $\mathcal{V}$; thus that subspace is equal to $\mathcal{V}$. Whenever the representation holds, we can always choose a basis of $\mathcal{V}$ formed by bases of the subspaces $\mathcal{V}(\lambda_1),\ldots,\mathcal{V}(\lambda_L)$. Then this basis consists entirely of eigenvectors of $\mathbf{A}$, and the matrix of $\mathbf{A}$ relative to it becomes a diagonal matrix, namely,

$$M(\mathbf{A}) = \begin{bmatrix} \lambda_1 & & & & & \\ & \ddots & & & & \\ & & \lambda_1 & & & \\ & & & \lambda_2 & & \\ & & & & \ddots & \\ & & & & & \lambda_L \end{bmatrix} \tag{27.3$_1$}$$

where each $\lambda_k$ is repeated $d_k$ times, $d_k$ being the algebraic as well as the geometric multiplicity of $\lambda_k$. Of course, the representation of $\mathcal{V}$ by direct sum of eigenspaces of $\mathbf{A}$ is possible if $\mathbf{A}$ has $N=\dim\mathcal{V}$ distinct eigenvalues. In this case the matrix of $\mathbf{A}$ taken with respect to a basis of eigenvectors has the diagonal form

$$M(\mathbf{A}) = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_N \end{bmatrix} \tag{27.3$_2$}$$

If the eigenvalues of $\mathbf{A}$ are not all distinct, then in general the geometric multiplicity of an eigenvalue may be less than the algebraic multiplicity. Whenever the two multiplicities are different for at least one eigenvalue of $\mathbf{A}$, it is no longer possible to find any basis in which the matrix of $\mathbf{A}$ is diagonal. However, if $\mathcal{V}$ is an inner product space and if $\mathbf{A}$ is Hermitian, then a diagonal matrix of $\mathbf{A}$ can always be found; we shall now investigate this problem.


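Recall that if $\mathbf{u}$ and $\mathbf{v}$ are arbitrary vectors in $\mathcal{V}$, the adjoint $\mathbf{A}^*$ of $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ is defined by

$$\mathbf{A}\mathbf{u}\cdot\mathbf{v} = \mathbf{u}\cdot\mathbf{A}^*\mathbf{v} \tag{27.4}$$

As usual, if the matrix of $\mathbf{A}$ is referred to a basis $\{\mathbf{e}_k\}$, then the matrix of $\mathbf{A}^*$ is referred to the reciprocal basis $\{\mathbf{e}^k\}$ and is given by [cf. (18.4)]

$$M(\mathbf{A}^*) = \overline{M(\mathbf{A})}^{\,T} \tag{27.5}$$

where the overbar indicates the complex conjugate as usual. If $\mathbf{A}$ is Hermitian, i.e., if $\mathbf{A}=\mathbf{A}^*$, then (27.4) reduces to

$$\mathbf{A}\mathbf{u}\cdot\mathbf{v} = \mathbf{u}\cdot\mathbf{A}\mathbf{v} \tag{27.6}$$

for all $\mathbf{u},\mathbf{v}\in\mathcal{V}$.

Theorem 27.2. The eigenvalues of a Hermitian endomorphism are all real.

Proof. Let $\mathbf{A}\in\mathcal{L}(\mathcal{V};\mathcal{V})$ be Hermitian. Since $\mathbf{A}\mathbf{u}=\lambda\mathbf{u}$ for any eigenvalue $\lambda$, we have

$$\lambda = \frac{\mathbf{A}\mathbf{u}\cdot\mathbf{u}}{\mathbf{u}\cdot\mathbf{u}} \tag{27.7}$$

Therefore we must show that $\mathbf{A}\mathbf{u}\cdot\mathbf{u}$ is real or, equivalently, we must show $\mathbf{A}\mathbf{u}\cdot\mathbf{u}=\overline{\mathbf{A}\mathbf{u}\cdot\mathbf{u}}$. By (27.6)

$$\mathbf{A}\mathbf{u}\cdot\mathbf{u} = \mathbf{u}\cdot\mathbf{A}\mathbf{u} = \overline{\mathbf{A}\mathbf{u}\cdot\mathbf{u}} \tag{27.8}$$

where the rule $\mathbf{u}\cdot\mathbf{v}=\overline{\mathbf{v}\cdot\mathbf{u}}$ has been used. Equation (27.8) yields the desired result.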


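Theorem 27.3. If $\mathbf{A}$ is Hermitian and if $\mathcal{V}_1$ is an $\mathbf{A}$-invariant subspace of $\mathcal{V}$, then $\mathcal{V}_1^\perp$ is also $\mathbf{A}$-invariant.

Proof. If $\mathbf{v}\in\mathcal{V}_1$ and $\mathbf{u}\in\mathcal{V}_1^\perp$, then $\mathbf{A}\mathbf{v}\cdot\mathbf{u}=0$ because $\mathbf{A}\mathbf{v}\in\mathcal{V}_1$. But since $\mathbf{A}$ is Hermitian, $\mathbf{A}\mathbf{v}\cdot\mathbf{u}=\mathbf{v}\cdot\mathbf{A}\mathbf{u}=0$. Therefore, $\mathbf{A}\mathbf{u}\in\mathcal{V}_1^\perp$, which proves the theorem.

Theorem 27.4. If $\mathbf{A}$ is Hermitian, the algebraic multiplicity of each eigenvalue equals the geometric multiplicity.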


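Proof. Let $\mathcal{V}(\lambda_0)$ be the characteristic subspace associated with an eigenvalue $\lambda_0$. Then the geometric multiplicity of $\lambda_0$ is $M=\dim\mathcal{V}(\lambda_0)$. By Theorems 13.4 and 27.3

$$\mathcal{V} = \mathcal{V}(\lambda_0)\oplus\mathcal{V}(\lambda_0)^\perp \tag{27.9}$$

where $\mathcal{V}(\lambda_0)$ and $\mathcal{V}(\lambda_0)^\perp$ are $\mathbf{A}$-invariant. By Theorem 24.1

$$\mathbf{A} = \mathbf{A}_1+\mathbf{A}_2$$

and by Theorem 17.4

$$\mathbf{I} = \mathbf{P}_1+\mathbf{P}_2$$

where $\mathbf{P}_1$ projects $\mathcal{V}$ onto $\mathcal{V}(\lambda_0)$, $\mathbf{P}_2$ projects $\mathcal{V}$ onto $\mathcal{V}(\lambda_0)^\perp$, $\mathbf{A}_1=\mathbf{A}\mathbf{P}_1$, and $\mathbf{A}_2=\mathbf{A}\mathbf{P}_2$. By Theorem 18.10, $\mathbf{P}_1$ and $\mathbf{P}_2$ are Hermitian and they also commute with $\mathbf{A}$. Indeed, for any $\mathbf{v}\in\mathcal{V}$, $\mathbf{P}_1\mathbf{v}\in\mathcal{V}(\lambda_0)$ and $\mathbf{P}_2\mathbf{v}\in\mathcal{V}(\lambda_0)^\perp$, and thus $\mathbf{A}\mathbf{P}_1\mathbf{v}\in\mathcal{V}(\lambda_0)$ and $\mathbf{A}\mathbf{P}_2\mathbf{v}\in\mathcal{V}(\lambda_0)^\perp$. But since

$$\mathbf{A}\mathbf{v} = \mathbf{A}(\mathbf{P}_1+\mathbf{P}_2)\mathbf{v} = \mathbf{A}\mathbf{P}_1\mathbf{v}+\mathbf{A}\mathbf{P}_2\mathbf{v}$$

we see that $\mathbf{A}\mathbf{P}_1\mathbf{v}$ is the $\mathcal{V}(\lambda_0)$ component of $\mathbf{A}\mathbf{v}$ and $\mathbf{A}\mathbf{P}_2\mathbf{v}$ is the $\mathcal{V}(\lambda_0)^\perp$ component of $\mathbf{A}\mathbf{v}$. Therefore

$$\mathbf{P}_1\mathbf{A}\mathbf{v} = \mathbf{A}\mathbf{P}_1\mathbf{v}, \qquad \mathbf{P}_2\mathbf{A}\mathbf{v} = \mathbf{A}\mathbf{P}_2\mathbf{v}$$

for all $\mathbf{v}\in\mathcal{V}$, or, equivalently

$$\mathbf{P}_1\mathbf{A} = \mathbf{A}\mathbf{P}_1, \qquad \mathbf{P}_2\mathbf{A} = \mathbf{A}\mathbf{P}_2$$

Together with the fact that $\mathbf{P}_1$ and $\mathbf{P}_2$ are Hermitian, these equations imply that $\mathbf{A}_1$ and $\mathbf{A}_2$ are also Hermitian. Further, $\mathbf{A}$ is reduced by the subspaces $\mathcal{V}(\lambda_0)$ and $\mathcal{V}(\lambda_0)^\perp$, since

$$\mathbf{A}_1\mathbf{A}_2 = \mathbf{A}\mathbf{P}_1\mathbf{A}\mathbf{P}_2 = \mathbf{A}^2\mathbf{P}_1\mathbf{P}_2 = \mathbf{0}$$

Thus if we select a basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ such that $\{\mathbf{e}_1,\ldots,\mathbf{e}_M\}$ span $\mathcal{V}(\lambda_0)$ and $\{\mathbf{e}_{M+1},\ldots,\mathbf{e}_N\}$ span $\mathcal{V}(\lambda_0)^\perp$, then by the result of Exercise 24.1 the matrix of $\mathbf{A}$ relative to $\{\mathbf{e}_k\}$ takes the form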


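$$M(\mathbf{A}) = \begin{bmatrix} \mathsf{A}_{(1)} & 0 \\ 0 & \mathsf{A}_{(2)} \end{bmatrix}$$

and the matrices of $\mathbf{A}_1$ and $\mathbf{A}_2$ are

$$M(\mathbf{A}_1) = \begin{bmatrix} \mathsf{A}_{(1)} & 0 \\ 0 & 0 \end{bmatrix}, \qquad M(\mathbf{A}_2) = \begin{bmatrix} 0 & 0 \\ 0 & \mathsf{A}_{(2)} \end{bmatrix} \tag{27.10}$$

where $\mathsf{A}_{(1)}=[A^i{}_j]$, $i,j=1,\ldots,M$, and $\mathsf{A}_{(2)}=[A^i{}_j]$, $i,j=M+1,\ldots,N$, denote the diagonal blocks. These forms imply

$$\det(\mathbf{A}-\lambda\mathbf{I}) = \det(\mathbf{A}_1-\lambda\mathbf{I}_{\mathcal{V}(\lambda_0)})\det(\mathbf{A}_2-\lambda\mathbf{I}_{\mathcal{V}(\lambda_0)^\perp})$$

where $\mathbf{A}_1$ and $\mathbf{A}_2$ are here restricted to $\mathcal{V}(\lambda_0)$ and $\mathcal{V}(\lambda_0)^\perp$, respectively. By (25.2), $\mathbf{A}_1=\lambda_0\mathbf{I}_{\mathcal{V}(\lambda_0)}$; thus by (21.21)

$$\det(\mathbf{A}-\lambda\mathbf{I}) = (\lambda_0-\lambda)^M\det(\mathbf{A}_2-\lambda\mathbf{I}_{\mathcal{V}(\lambda_0)^\perp})$$

On the other hand, $\lambda_0$ is not an eigenvalue of $\mathbf{A}_2$. Therefore

$$\det(\mathbf{A}_2-\lambda_0\mathbf{I}_{\mathcal{V}(\lambda_0)^\perp}) \neq 0$$

Hence the algebraic multiplicity of $\lambda_0$ equals $M$, the geometric multiplicity.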


The preceding theorem implies immediately the important result that $\mathcal{V}$ can be represented by a direct sum of eigenspaces of $\mathbf{A}$ if $\mathbf{A}$ is Hermitian. In particular, there exists a basis consisting entirely of eigenvectors of $\mathbf{A}$, and the matrix of $\mathbf{A}$ relative to this basis is diagonal. However, before we state this result formally as a theorem, we first strengthen the result of Theorem 27.1 for the special case that $\mathbf{A}$ is Hermitian.

Theorem 27.5. If $\mathbf{A}$ is Hermitian, the eigenspaces corresponding to distinct eigenvalues $\lambda_1$ and $\lambda_2$ are orthogonal.

Proof. Let $\mathbf{A}\mathbf{u}_1=\lambda_1\mathbf{u}_1$ and $\mathbf{A}\mathbf{u}_2=\lambda_2\mathbf{u}_2$. Then, since $\lambda_2$ is real by Theorem 27.2,

$$\lambda_1\mathbf{u}_1\cdot\mathbf{u}_2 = \mathbf{A}\mathbf{u}_1\cdot\mathbf{u}_2 = \mathbf{u}_1\cdot\mathbf{A}\mathbf{u}_2 = \lambda_2\mathbf{u}_1\cdot\mathbf{u}_2$$

Since $\lambda_1\neq\lambda_2$, $\mathbf{u}_1\cdot\mathbf{u}_2=0$, which proves the theorem.

The main theorem regarding Hermitian endomorphisms is the following.

Theorem 27.6. If $\mathbf{A}$ is a Hermitian endomorphism with (distinct) eigenvalues $\lambda_1,\lambda_2,\ldots,\lambda_L$, then $\mathcal{V}$ has the representation

$$\mathcal{V} = \mathcal{V}(\lambda_1)\oplus\mathcal{V}(\lambda_2)\oplus\cdots\oplus\mathcal{V}(\lambda_L) \tag{27.11}$$

where the eigenspaces $\mathcal{V}(\lambda_k)$ are mutually orthogonal.

The proof of this theorem is a direct consequence of Theorems 27.4 and 27.5 and the remark following the proof of Theorem 27.1.


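Corollary (Spectral Theorem). If $\mathbf{A}$ is a Hermitian endomorphism with (distinct) eigenvalues $\lambda_1,\ldots,\lambda_L$, then

$$\mathbf{A} = \sum_{j=1}^{L}\lambda_j\mathbf{P}_j \tag{27.12}$$

where $\mathbf{P}_j$ is the perpendicular projection of $\mathcal{V}$ onto $\mathcal{V}(\lambda_j)$, for $j=1,\ldots,L$.

Proof. By Theorem 27.6, $\mathbf{A}$ has a representation of the form (24.1). Let $\mathbf{u}$ be an arbitrary element of $\mathcal{V}$; then, by (27.11),

$$\mathbf{u} = \mathbf{u}_1+\cdots+\mathbf{u}_L \tag{27.13}$$

where $\mathbf{u}_j\in\mathcal{V}(\lambda_j)$. By (24.3), (27.13), and (25.1)

$$\mathbf{A}_j\mathbf{u} = \mathbf{A}_j\mathbf{u}_j = \mathbf{A}\mathbf{u}_j = \lambda_j\mathbf{u}_j$$

But $\mathbf{u}_j=\mathbf{P}_j\mathbf{u}$; therefore

$$\mathbf{A}_j = \lambda_j\mathbf{P}_j \quad\text{(no sum)}$$

which, with (24.1), proves the corollary.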


The reader is reminded that the $L$ perpendicular projections satisfy the equations

$$\sum_{j=1}^{L}\mathbf{P}_j = \mathbf{I} \tag{27.14}$$

$$\mathbf{P}_j^2 = \mathbf{P}_j \tag{27.15}$$

$$\mathbf{P}_j = \mathbf{P}_j^* \tag{27.16}$$

and

$$\mathbf{P}_j\mathbf{P}_i = \mathbf{0}, \qquad i\neq j \tag{27.17}$$

These equations follow from Theorems 14.4, 18.10 and 18.11. Certain other endomorphisms also have a spectral representation of the form (27.12); however, the projections are not perpendicular ones and do not obey the condition (27.16).
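The spectral representation (27.12) and the projection identities (27.14)-(27.17) can be verified on a small real symmetric example. In the Python sketch below the matrix and its eigenvalues are hypothetical data of ours, and the projections are built from a Lagrange-type product formula, $\mathbf{P}_j=\prod_{k\neq j}(\mathbf{A}-\lambda_k\mathbf{I})/(\lambda_j-\lambda_k)$. That formula is an assumption of this sketch and is not derived in the text; the assertions that follow are what justify it here.

```python
from fractions import Fraction

def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in X]

def transpose(X):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*X)]

# Hypothetical real symmetric matrix with characteristic polynomial (1 - t)(3 - t).
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
eigenvalues = [Fraction(1), Fraction(3)]
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

def projection(j):
    """Assumed formula: P_j = prod over k != j of (A - lam_k I) / (lam_j - lam_k)."""
    P = I
    for k, lam_k in enumerate(eigenvalues):
        if k != j:
            factor = matadd(A, scale(-lam_k, I))
            P = scale(1 / (eigenvalues[j] - lam_k), matmul(P, factor))
    return P

P = [projection(j) for j in range(len(eigenvalues))]

# Right-hand side of (27.12).
spectral_sum = matadd(scale(eigenvalues[0], P[0]), scale(eigenvalues[1], P[1]))
print(P[0], P[1], spectral_sum == A)
```

Because the matrix is real and symmetric, the computed projections are themselves symmetric, which is the real form of (27.16).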


Another corollary of Theorem 27.6 is that if $\mathbf{A}$ is Hermitian, there exists an orthogonal basis for $\mathcal{V}$ consisting entirely of eigenvectors of $\mathbf{A}$. This corollary is clear because each eigenspace is orthogonal to the others and within each eigenspace an orthogonal basis can be selected. (Theorem 13.3 ensures that an orthonormal basis for each eigenspace can be found.) With respect to this basis of eigenvectors, the matrix of $\mathbf{A}$ is clearly diagonal. Thus the problem of finding a basis for $\mathcal{V}$ such that $M(\mathbf{A})$ is diagonal is solved for Hermitian endomorphisms.


If f(A) is any polynomial in the Hermitian endomorphism, then (27.12),(27.15) and 
(27.17) can be used to show that 


f(A)= > f (4P, (27.18) 


where f(A) is the same polynomial except that the variable A is replaced by the scalar 2. For 
example, the polynomial P’ has the representation 


N 
P=)? 
jel 


In general, $f(\mathbf{A})$ is Hermitian if and only if $f(\lambda_j)$ is real for all $j = 1,\ldots,L$. If the eigenvalues of
$\mathbf{A}$ are all nonnegative, then we can extract Hermitian roots of $\mathbf{A}$ by the following rule:


$\mathbf{A}^{1/k} = \sum_{j=1}^{L} \lambda_j^{1/k}\mathbf{P}_j$    (27.19)


where $\lambda_j^{1/k} \geq 0$. Then we can verify easily that $(\mathbf{A}^{1/k})^k = \mathbf{A}$. If $\mathbf{A}$ has no zero eigenvalues, then

$\mathbf{A}^{-1} = \sum_{j=1}^{L} \frac{1}{\lambda_j}\mathbf{P}_j$    (27.20)


which is easily confirmed.
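
The rules (27.18) through (27.20) can be tried out numerically. The following is only a minimal sketch: the 2x2 matrix, its eigenvalues 1 and 3, and the helper names `matmul` and `spectral_function` are assumptions chosen for illustration and do not come from the text.

```python
import math

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def spectral_function(f, eigenvalues, projections):
    """Return sum_j f(lambda_j) P_j, i.e. the rule (27.18) in matrix form."""
    n = len(projections[0])
    out = [[0.0] * n for _ in range(n)]
    for lam, P in zip(eigenvalues, projections):
        for i in range(n):
            for j in range(n):
                out[i][j] += f(lam) * P[i][j]
    return out

# Assumed example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3, with
# perpendicular projections onto span{(1, -1)} and span{(1, 1)}.
eigenvalues = [1.0, 3.0]
P1 = [[0.5, -0.5], [-0.5, 0.5]]
P2 = [[0.5, 0.5], [0.5, 0.5]]
projections = [P1, P2]

A = spectral_function(lambda x: x, eigenvalues, projections)        # (27.12)
sqrt_A = spectral_function(math.sqrt, eigenvalues, projections)     # (27.19), k = 2
inv_A = spectral_function(lambda x: 1.0 / x, eigenvalues, projections)  # (27.20)
```

Multiplying `sqrt_A` by itself should reproduce `A`, and `A` times `inv_A` should give the identity, up to rounding.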


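A Hermitian endomorphism $\mathbf{A}$ is defined to be


positive definite if $\mathbf{v}\cdot\mathbf{A}\mathbf{v} > 0$
positive semidefinite if $\mathbf{v}\cdot\mathbf{A}\mathbf{v} \geq 0$
negative semidefinite if $\mathbf{v}\cdot\mathbf{A}\mathbf{v} \leq 0$
negative definite if $\mathbf{v}\cdot\mathbf{A}\mathbf{v} < 0$


for all nonzero $\mathbf{v}$. It follows from (27.12) that

$\mathbf{v}\cdot\mathbf{A}\mathbf{v} = \sum_{j=1}^{L} \lambda_j(\mathbf{v}_j\cdot\mathbf{v}_j)$    (27.21)


where


$\mathbf{v} = \sum_{j=1}^{L}\mathbf{v}_j$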


Equation (27.21) implies the following important theorem. 


Theorem 27.7. A Hermitian endomorphism $\mathbf{A}$ is positive definite, positive semidefinite,
negative semidefinite, or negative definite if and only if every eigenvalue of $\mathbf{A}$ is, respectively,
$> 0$, $\geq 0$, $\leq 0$, or $< 0$.


As corollaries to Theorem 27.7 it follows that positive-definite and negative-definite 
Hermitian endomorphisms are regular [see (27.20)], and positive-definite and positive-semidefinite 
endomorphisms possess Hermitian roots. 


Every complex number $z$ has a polar representation in the form $z = re^{i\theta}$, where $r \geq 0$. It
turns out that an endomorphism also has polar decompositions, and we shall deduce an important
special case here.


Theorem 27.8. (Polar Decomposition Theorem). Every automorphism A has two unique 
multiplicative decompositions 


A=RU and A=VR (27.22) 


where R is unitary and U and V are Hermitian and positive definite. 


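Proof. For each automorphism $\mathbf{A}$, $\mathbf{A}^*\mathbf{A}$ is a positive-definite Hermitian endomorphism, and
hence it has a spectral decomposition of the form


$\mathbf{A}^*\mathbf{A} = \sum_{j=1}^{L}\lambda_j\mathbf{P}_j$    (27.23)


where $\lambda_j > 0$, $j = 1,\ldots,L$. We define $\mathbf{U}$ by

$\mathbf{U} = (\mathbf{A}^*\mathbf{A})^{1/2} = \sum_{j=1}^{L}\lambda_j^{1/2}\mathbf{P}_j$    (27.24)


Clearly $\mathbf{U}$ is Hermitian and positive definite. Since $\mathbf{U}$ is positive definite it is regular. We now
define

$\mathbf{R} = \mathbf{A}\mathbf{U}^{-1}$    (27.25)

By this formula $\mathbf{R}$ is regular and satisfies

$\mathbf{R}^*\mathbf{R} = \mathbf{U}^{-1}\mathbf{A}^*\mathbf{A}\mathbf{U}^{-1} = \mathbf{U}^{-1}\mathbf{U}^2\mathbf{U}^{-1} = \mathbf{I}$    (27.26)

Therefore $\mathbf{R}$ is unitary. To prove the uniqueness, assume

$\mathbf{R}\mathbf{U} = \mathbf{R}_1\mathbf{U}_1$    (27.27)


and we shall prove $\mathbf{R} = \mathbf{R}_1$ and $\mathbf{U} = \mathbf{U}_1$. From (27.27)


$\mathbf{U}^2 = \mathbf{U}\mathbf{R}^*\mathbf{R}\mathbf{U} = (\mathbf{R}\mathbf{U})^*\mathbf{R}\mathbf{U} = (\mathbf{R}_1\mathbf{U}_1)^*\mathbf{R}_1\mathbf{U}_1 = \mathbf{U}_1^2$


Since the positive-definite Hermitian square root of $\mathbf{U}^2$ is unique, we find $\mathbf{U} = \mathbf{U}_1$. Then (27.27)
implies $\mathbf{R} = \mathbf{R}_1$. The decomposition (27.22)$_2$ follows either by defining $\mathbf{V}^2 = \mathbf{A}\mathbf{A}^*$, or,
equivalently, by defining $\mathbf{V} = \mathbf{R}\mathbf{U}\mathbf{R}^*$ and repeating the above argument.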
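
The construction in the proof can be followed step by step for a real 2x2 matrix. This is a sketch under stated assumptions, not a general routine: the function name `polar_2x2` and the sample matrix are hypothetical, the eigenvalues of $\mathbf{A}^T\mathbf{A}$ are taken from the quadratic formula, and the projections are obtained from Sylvester's formula (28.18) of Section 28 rather than from eigenvectors.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def polar_2x2(A):
    """Right polar decomposition A = R U of an invertible real 2x2 matrix."""
    S = matmul(transpose(A), A)            # S = A^T A, symmetric positive definite
    tr = S[0][0] + S[1][1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    lam1, lam2 = (tr - disc) / 2.0, (tr + disc) / 2.0
    I = [[1.0, 0.0], [0.0, 1.0]]
    if disc < 1e-12 * tr:                  # repeated eigenvalue: S = lam1 * I
        terms = [(lam1, I)]
    else:                                  # two distinct eigenvalues, as in (27.23)
        P1 = [[(lam2 * I[i][j] - S[i][j]) / (lam2 - lam1) for j in range(2)] for i in range(2)]
        P2 = [[(lam1 * I[i][j] - S[i][j]) / (lam1 - lam2) for j in range(2)] for i in range(2)]
        terms = [(lam1, P1), (lam2, P2)]
    # U = sum sqrt(lambda_j) P_j as in (27.24); its inverse uses 1/sqrt(lambda_j).
    U = [[sum(math.sqrt(l) * P[i][j] for l, P in terms) for j in range(2)] for i in range(2)]
    U_inv = [[sum(P[i][j] / math.sqrt(l) for l, P in terms) for j in range(2)] for i in range(2)]
    R = matmul(A, U_inv)                   # (27.25)
    return R, U

A = [[2.0, 1.0], [0.0, 1.0]]               # assumed sample automorphism (det = 2)
R, U = polar_2x2(A)
```

For this sample, `R` should be orthogonal (the real counterpart of unitary) and the product of `R` and `U` should return `A`, up to rounding.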


A theorem more general than the last one holds even if $\mathbf{A}$ is not required to be regular.
However, in this case $\mathbf{R}$ is not unique.


Before closing this section we mention again that if the scalar field is real, then a Hermitian 
endomorphism is symmetric, and a unitary endomorphism is orthogonal. The reader may rephrase 
the theorems in this and other sections for the real case according to this simple rule. 


Exercises 


27.1 Show that Theorem 27.5 remains valid if A is unitary or skew-Hermitian rather than 
Hermitian. 


27.2 If $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ is Hermitian, show that $\mathbf{A}$ is positive semidefinite if and only if the
fundamental invariants of $\mathbf{A}$ are nonnegative.


27.3 Given an endomorphism $\mathbf{A}$, the exponential of $\mathbf{A}$ is an endomorphism $\exp\mathbf{A}$ defined by
the series

$\exp\mathbf{A} = \sum_{j=0}^{\infty}\frac{1}{j!}\mathbf{A}^j$    (27.28)


It is possible to show that this series converges in a definite sense for all $\mathbf{A}$. We shall
examine this question in Section 65. Show that if $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ is a Hermitian
endomorphism given by the representation (27.12), then


$\exp\mathbf{A} = \sum_{j=1}^{L} e^{\lambda_j}\mathbf{P}_j$    (27.29)

This result shows that the series representation of $\exp\mathbf{A}$ is consistent with (27.18).


27.4 Suppose that $\mathbf{A}$ is a positive-definite Hermitian endomorphism; give a definition for $\log\mathbf{A}$
by power series and one by a formula similar to (27.18), then show that the two definitions
are consistent. Further, prove that


$\log\exp\mathbf{A} = \mathbf{A}$


27.5 For any $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ show that $\mathbf{A}^*\mathbf{A}$ and $\mathbf{A}\mathbf{A}^*$ are Hermitian and positive semidefinite.
Also, show that $\mathbf{A}$ is regular if and only if $\mathbf{A}^*\mathbf{A}$ and $\mathbf{A}\mathbf{A}^*$ are positive definite.


27.6 Show that $\mathbf{A}^k = \mathbf{B}^k$, where $k$ is a positive integer, generally does not imply $\mathbf{A} = \mathbf{B}$ even if
both $\mathbf{A}$ and $\mathbf{B}$ are Hermitian.


Section 28. Illustrative Examples


In this section we shall illustrate certain of the results of the preceding section by working
selected numerical examples. For simplicity, the basis of $\mathscr{V}$ shall be taken to be the orthonormal
basis $\{\mathbf{i}_1,\ldots,\mathbf{i}_N\}$ introduced in (13.1). The vector equation (25.1) takes the component form


$A_{kj}v_j = \lambda v_k$    (28.1)

where

$\mathbf{v} = v_j\mathbf{i}_j$    (28.2)

and

$\mathbf{A}\mathbf{i}_j = A_{kj}\mathbf{i}_k$    (28.3)


Since the basis is orthonormal, we have written all indices as subscripts, and the summation
convention is applied in the usual way.


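Example 1. Consider a real three-dimensional vector space $\mathscr{V}$. Let the matrix of an
endomorphism $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ be


$M(\mathbf{A}) = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{bmatrix}$    (28.4)


Clearly $\mathbf{A}$ is symmetric and, thus, the theorems of the preceding section can be applied. By direct
expansion of the determinant of $M(\mathbf{A} - \lambda\mathbf{I})$ the characteristic polynomial is


$f(\lambda) = (-\lambda)(1-\lambda)(3-\lambda)$    (28.5)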


Therefore the three eigenvalues of $\mathbf{A}$ are distinct and are given by

$\lambda_1 = 0, \quad \lambda_2 = 1, \quad \lambda_3 = 3$    (28.6)


The ordering of the eigenvalues is not important. Since the eigenvalues are distinct, their
corresponding characteristic subspaces are one-dimensional. For definiteness, let $\mathbf{v}^{(p)}$ be an
eigenvector associated with $\lambda_p$. As usual, we can represent $\mathbf{v}^{(p)}$ by


$\mathbf{v}^{(p)} = v_k^{(p)}\mathbf{i}_k$    (28.7)

Then (28.1), for $p = 1$, reduces to

$v_1^{(1)} + v_2^{(1)} = 0, \quad v_1^{(1)} + 2v_2^{(1)} + v_3^{(1)} = 0, \quad v_2^{(1)} + v_3^{(1)} = 0$    (28.8)

The general solution of this linear system is

$v_1^{(1)} = t, \quad v_2^{(1)} = -t, \quad v_3^{(1)} = t$    (28.9)


for all $t \in \mathscr{R}$. In particular, if $\mathbf{v}^{(1)}$ is required to be a unit vector, then we can choose $t = \pm 1/\sqrt{3}$,
where the choice of sign is arbitrary, say


$v_1^{(1)} = 1/\sqrt{3}, \quad v_2^{(1)} = -1/\sqrt{3}, \quad v_3^{(1)} = 1/\sqrt{3}$    (28.10)


So

$\mathbf{v}^{(1)} = (1/\sqrt{3})\mathbf{i}_1 - (1/\sqrt{3})\mathbf{i}_2 + (1/\sqrt{3})\mathbf{i}_3$    (28.11)


Likewise we find for $p = 2$


$\mathbf{v}^{(2)} = (1/\sqrt{2})\mathbf{i}_1 - (1/\sqrt{2})\mathbf{i}_3$    (28.12)


and for $p = 3$


$\mathbf{v}^{(3)} = (1/\sqrt{6})\mathbf{i}_1 + (2/\sqrt{6})\mathbf{i}_2 + (1/\sqrt{6})\mathbf{i}_3$    (28.13)

It is easy to check that $\{\mathbf{v}^{(1)}, \mathbf{v}^{(2)}, \mathbf{v}^{(3)}\}$ is an orthonormal basis.


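By (28.6) and (27.12), $\mathbf{A}$ has the spectral decomposition


$\mathbf{A} = 0\mathbf{P}_1 + 1\mathbf{P}_2 + 3\mathbf{P}_3$    (28.14)


where $\mathbf{P}_k$ is the perpendicular projection defined by


$\mathbf{P}_k\mathbf{v}^{(k)} = \mathbf{v}^{(k)}, \quad \mathbf{P}_k\mathbf{v}^{(j)} = \mathbf{0}, \quad j \neq k$    (28.15)


for $k = 1,2,3$. In component form relative to the original orthonormal basis $\{\mathbf{i}_1,\mathbf{i}_2,\mathbf{i}_3\}$ these
projections are given by


$\mathbf{P}_k\mathbf{i}_j = v_j^{(k)}v_l^{(k)}\mathbf{i}_l$    (no sum on $k$)    (28.16)


This result follows from (28.15) and the transformation law (22.7) or directly from the
representations


$\mathbf{i}_1 = (1/\sqrt{3})\mathbf{v}^{(1)} + (1/\sqrt{2})\mathbf{v}^{(2)} + (1/\sqrt{6})\mathbf{v}^{(3)}$
$\mathbf{i}_2 = -(1/\sqrt{3})\mathbf{v}^{(1)} + (2/\sqrt{6})\mathbf{v}^{(3)}$    (28.17)
$\mathbf{i}_3 = (1/\sqrt{3})\mathbf{v}^{(1)} - (1/\sqrt{2})\mathbf{v}^{(2)} + (1/\sqrt{6})\mathbf{v}^{(3)}$


since the coefficient matrix of $\{\mathbf{i}_1,\mathbf{i}_2,\mathbf{i}_3\}$ relative to $\{\mathbf{v}^{(1)},\mathbf{v}^{(2)},\mathbf{v}^{(3)}\}$ is the transpose of that of
$\{\mathbf{v}^{(1)},\mathbf{v}^{(2)},\mathbf{v}^{(3)}\}$ relative to $\{\mathbf{i}_1,\mathbf{i}_2,\mathbf{i}_3\}$.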
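
The spectral decomposition (28.14) can be checked in exact arithmetic. The sketch below assumes that the matrix of Example 1 is [[1, 1, 0], [1, 2, 1], [0, 1, 1]], which is consistent with the eigenvalues (28.6) and the eigenvectors (28.11) through (28.13); the helper names are illustrative only. Each projection is formed from an unnormalized eigenvector $\mathbf{w}$ as $\mathbf{w}\mathbf{w}^T/(\mathbf{w}\cdot\mathbf{w})$, which avoids square roots.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def outer_projection(w):
    """Perpendicular projection onto span{w}: P = w w^T / (w . w)."""
    norm2 = sum(Fraction(x) * x for x in w)
    return [[Fraction(a) * b / norm2 for b in w] for a in w]

# Eigenvalues from (28.6) paired with unnormalized eigenvectors
# proportional to (28.11), (28.12), and (28.13).
eigenpairs = [(0, (1, -1, 1)), (1, (1, 0, -1)), (3, (1, 2, 1))]
projections = [outer_projection(w) for _, w in eigenpairs]

# A = sum_k lambda_k P_k, as in (28.14).
A = [[sum(lam * P[i][j] for (lam, _), P in zip(eigenpairs, projections))
      for j in range(3)] for i in range(3)]

# The projections should add up to the identity, as in (27.14).
identity_sum = [[sum(P[i][j] for P in projections) for j in range(3)]
                for i in range(3)]
```

Under these assumptions `A` reproduces the matrix above exactly, and products of two different projections vanish, in agreement with (27.17).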


There is a result, known as Sylvester's Theorem, which enables one to compute the
projections directly. We shall not prove this theorem here, but we shall state the formula in the
case when the eigenvalues are distinct. The result is

$\mathbf{P}_j = \dfrac{\prod_{k=1,\,k\neq j}^{L}(\lambda_k\mathbf{I} - \mathbf{A})}{\prod_{k=1,\,k\neq j}^{L}(\lambda_k - \lambda_j)}$    (28.18)


The advantage of this formula is that one does not need to know the eigenvectors in order to find
the projections. With respect to an arbitrary basis, (28.18) yields


$M(\mathbf{P}_j) = \dfrac{\prod_{k=1,\,k\neq j}^{L}(\lambda_k M(\mathbf{I}) - M(\mathbf{A}))}{\prod_{k=1,\,k\neq j}^{L}(\lambda_k - \lambda_j)}$    (28.19)


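Example 2. To illustrate (28.19), let


$M(\mathbf{A}) = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}$    (28.20)


with respect to an orthonormal basis. The eigenvalues of this matrix are easily found to be
$\lambda_1 = -2$, $\lambda_2 = 3$. Then, (28.19) yields


$M(\mathbf{P}_1) = \dfrac{1}{5}\left(3\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}\right) = \dfrac{1}{5}\begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix}$


and


$M(\mathbf{P}_2) = -\dfrac{1}{5}\left(-2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}\right) = \dfrac{1}{5}\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}$


The spectral theorem for this linear transformation yields the matrix equation

$\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix} = -\dfrac{2}{5}\begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} + \dfrac{3}{5}\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}$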

Exercises


28.1 The matrix of $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ with respect to an orthonormal basis is
$M(\mathbf{A}) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$


(a) Express $M(\mathbf{A})$ in its spectral form.
(b) Express $M(\mathbf{A}^{-1})$ in its spectral form.
(c) Find the matrix of the square root of $\mathbf{A}$.


28.2 The matrix of $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ with respect to an orthonormal basis is


M(A) = 


Pe oO W 
O N O 
WwW on 


Find an orthonormal basis for $\mathscr{V}$ relative to which the matrix of $\mathbf{A}$ is diagonal.


28.3 If $\mathbf{A} \in \mathscr{L}(\mathscr{V};\mathscr{V})$ is defined by
Ai, = 2V2i, —2i, 


Ai, = (5 V2 +2), +(8-SVD)i 


and 


where $\{\mathbf{i}_1,\mathbf{i}_2,\mathbf{i}_3\}$ is an orthonormal basis for $\mathscr{V}$, determine the linear transformations $\mathbf{R}$,
$\mathbf{U}$, and $\mathbf{V}$ in the polar decomposition theorem.


Section 29. The Minimal Polynomial


In Section 26 we remarked that for any given endomorphism A there exist some 
polynomials f(t) such that 


f(A)=0 (29.1) 


For example, by the Cayley-Hamilton theorem, we can always choose $f(t)$ to be the characteristic
polynomial of $\mathbf{A}$. Another obvious choice can be found by observing that, since $\mathscr{L}(\mathscr{V};\mathscr{V})$
has dimension $N^2$, the set

$\{\mathbf{A}^0 = \mathbf{I}, \mathbf{A}^1, \ldots, \mathbf{A}^{N^2}\}$


is necessarily linearly dependent and, thus, there exist scalars $\{\alpha_0, \alpha_1, \ldots, \alpha_{N^2}\}$, not all equal to zero,
such that


$\alpha_0\mathbf{I} + \alpha_1\mathbf{A}^1 + \cdots + \alpha_{N^2}\mathbf{A}^{N^2} = \mathbf{0}$    (29.2)


For definiteness, let us denote the set of all polynomials $f$ satisfying the condition (29.1) for a
given $\mathbf{A}$ by the symbol $\mathscr{P}(\mathbf{A})$. We shall now show that $\mathscr{P}(\mathbf{A})$ has a very simple structure, called
a principal ideal.


In general, an ideal $\mathscr{I}$ is a subset of an integral domain $\mathscr{D}$ such that the following two
conditions are satisfied:


1. If $f$ and $g$ belong to $\mathscr{I}$, so does their sum $f + g$.
2. If $f$ belongs to $\mathscr{I}$ and $h$ is an arbitrary element of $\mathscr{D}$, then $fh = hf$ also belongs to $\mathscr{I}$.


We recall that the definition of an integral domain is given in Section 7. Of course, $\mathscr{D}$ itself
and the subset $\{0\}$ consisting of the zero element of $\mathscr{D}$ are obvious examples of ideals, and these
are called trivial ideals or improper ideals. Another example of an ideal is the subset $\mathscr{I} \subset \mathscr{D}$
consisting of all multiples of a particular element $g \in \mathscr{D}$, namely

$\mathscr{I} = \{hg, \; h \in \mathscr{D}\}$    (29.3)


It is easy to verify that this subset satisfies the two conditions for an ideal. Ideals of the special
form (29.3) are called principal ideals.


For the set $\mathscr{P}(\mathbf{A})$ we choose the integral domain $\mathscr{D}$ to be the set of all polynomials with
complex coefficients. (For a real vector space, the coefficients are required to be real, of course.)
Then it is obvious that $\mathscr{P}(\mathbf{A})$ is an ideal in $\mathscr{D}$, since if $f$ and $g$ satisfy the condition (29.1), so
does their sum $f + g$, and similarly if $f$ satisfies (29.1) and $h$ is an arbitrary polynomial, then


$(hf)(\mathbf{A}) = h(\mathbf{A})f(\mathbf{A}) = h(\mathbf{A})\mathbf{0} = \mathbf{0}$    (29.4)


The fact that $\mathscr{P}(\mathbf{A})$ is actually a principal ideal is a standard result in algebra, since we have the
following theorem.


Theorem 29.1. Every ideal of the polynomial domain is a principal ideal. 


Proof. We assume that the reader is familiar with the operation of division for polynomials. If $f$
and $g \neq 0$ are polynomials, we can divide $f$ by $g$ and obtain a remainder $r$ having degree less
than that of $g$, namely


$r(t) = f(t) - h(t)g(t)$    (29.5)


Now, to prove that $\mathscr{P}(\mathbf{A})$ can be represented by the form (29.3), we choose a polynomial $g \neq 0$
having the lowest degree in $\mathscr{P}(\mathbf{A})$. Then we claim that


$\mathscr{P}(\mathbf{A}) = \{hg, \; h \in \mathscr{D}\}$    (29.6)


To see this, we must show that every $f \in \mathscr{P}(\mathbf{A})$ can be divided by $g$ without a remainder.
Suppose that the division of $f$ by $g$ yields a remainder $r$ as shown in (29.5). Then since $\mathscr{P}(\mathbf{A})$
is an ideal and since $f, g \in \mathscr{P}(\mathbf{A})$, (29.5) shows that $r \in \mathscr{P}(\mathbf{A})$ also. But since the degree of $r$ is
less than the degree of $g$, $r \in \mathscr{P}(\mathbf{A})$ is possible if and only if $r = 0$. Thus $f = hg$, so the
representation (29.6) is valid.


A corollary of the preceding theorem is the fact that the nonzero polynomial $g$ having the
lowest degree in $\mathscr{P}(\mathbf{A})$ is unique to within multiplication by an arbitrary nonzero scalar. If we require
the leading coefficient of $g$ to be 1, then $g$ becomes unique, and we call this particular
polynomial $g$ the minimal polynomial of the endomorphism $\mathbf{A}$.


We pause here to give some examples of minimal polynomials. 


Example 1. The minimal polynomial of the zero endomorphism $\mathbf{0}$ is the polynomial $f(t) = 1$ of
zero degree, since by convention


$f(\mathbf{0}) = 1\,\mathbf{0}^0 = \mathbf{0}$    (29.7)


In general, if $\mathbf{A} \neq \mathbf{0}$, then the minimal polynomial of $\mathbf{A}$ is at least of degree 1, since in this case


$1\,\mathbf{A}^0 = 1\,\mathbf{I} \neq \mathbf{0}$    (29.8)


Example 2. Let $\mathbf{P}$ be a nontrivial projection. Then the minimal polynomial $g$ of $\mathbf{P}$ is


$g(t) = t^2 - t$    (29.9)


For, by the definition of a projection,


$\mathbf{P}^2 - \mathbf{P} = \mathbf{P}(\mathbf{P} - \mathbf{I}) = \mathbf{0}$    (29.10)


and since $\mathbf{P}$ is assumed to be nontrivial, the two lower degree divisors $t$ and $t - 1$ no longer
satisfy the condition (29.1) for $\mathbf{P}$.
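
The condition (29.1) is easy to test for a given matrix and a given polynomial. The sketch below is illustrative only: the helper `poly_at_matrix`, the sample projection, and the sample 3x3 nilcyclic matrix are assumptions, with polynomials stored as coefficient lists in increasing degree. It checks the claim of Example 2 and, with the same helper, the nilcyclic case treated in Example 3 for N = 3.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def poly_at_matrix(coeffs, M):
    """Evaluate c0*I + c1*M + ... + cn*M^n by Horner's rule; coeffs = [c0, ..., cn]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

def is_zero(M):
    return all(x == 0 for row in M for x in row)

# Assumed nontrivial (oblique) projection: P*P == P, P != 0, P != I.
P = [[1, 1], [0, 0]]
g_of_P = poly_at_matrix([0, -1, 1], P)     # g(t) = t^2 - t
t_of_P = poly_at_matrix([0, 1], P)         # divisor t
tm1_of_P = poly_at_matrix([-1, 1], P)      # divisor t - 1

# Assumed 3x3 nilcyclic matrix with ones on the superdiagonal.
C = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
C_cubed = poly_at_matrix([0, 0, 0, 1], C)  # t^3
C_squared = poly_at_matrix([0, 0, 1], C)   # t^2
```

Here `g_of_P` and `C_cubed` are zero matrices, while the lower-degree divisors give nonzero matrices, so neither polynomial can be replaced by one of its proper divisors.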


Example 3. For the endomorphism $\mathbf{C}$ whose matrix is given by (26.19) in Exercise 26.8, the
minimal polynomial $g$ is


$g(t) = t^N$    (29.11)


since we have seen in that exercise that

$\mathbf{C}^N = \mathbf{0}$ but $\mathbf{C}^{N-1} \neq \mathbf{0}$


For the proof of some theorems in the next section we need several other standard results in 
the algebra of polynomials. We summarize these results here. 


Theorem 29.2. If f and g are polynomials, then there exists a greatest common divisor d which 
is a divisor (i.e., a factor) of f and g and is also a multiple of every common divisor of f and g. 


Proof. We define the ideal $\mathscr{I}$ in the polynomial domain $\mathscr{D}$ by

$\mathscr{I} = \{hf + kg, \; h, k \in \mathscr{D}\}$    (29.12)

By Theorem 29.1, $\mathscr{I}$ is a principal ideal, and thus it has a representation

$\mathscr{I} = \{hd, \; h \in \mathscr{D}\}$    (29.13)


We claim that $d$ is a greatest common divisor of $f$ and $g$. Clearly, $d$ is a common divisor of $f$
and $g$, since $f$ and $g$ are themselves members of $\mathscr{I}$, so by (29.13) there exist $h$ and $k$ in $\mathscr{D}$
such that


$f = hd, \quad g = kd$    (29.14)


On the other hand, since $d$ is also a member of $\mathscr{I}$, by (29.12) there exist also $p$ and $q$ in $\mathscr{D}$ such
that


d= pf +qg (29.15) 


Therefore if c is any common divisor of f and g, say 


f =ac, g =bc (29.16) 


then from (29.15) 


$d = (pa + qb)c$    (29.17)

so $d$ is a multiple of $c$. Thus $d$ is a greatest common divisor of $f$ and $g$.


By the same argument as before, we see that the greatest common divisor $d$ of $f$ and $g$ is
unique to within a nonzero scalar factor. So we can render $d$ unique by requiring its leading
coefficient to be 1. Also, it is clear that the preceding theorem can be extended in an obvious way
to more than two polynomials. If the greatest common divisor of $f_1,\ldots,f_k$ is the zero degree
polynomial 1, then $f_1,\ldots,f_k$ are said to be relatively prime. Similarly $f_1,\ldots,f_k$ are pairwise prime
if each pair $f_i, f_j$, $i \neq j$, from $f_1,\ldots,f_k$ is relatively prime.
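
The relation (29.15) between $d$, $f$, and $g$ can be produced constructively. The proof above does not do so; the sketch below swaps in the extended Euclidean algorithm, built on the division rule (29.5), as one standard way to obtain $d$, $p$, and $q$. Polynomials are coefficient lists in increasing degree with rational entries, all function names are hypothetical, and at least one of the two inputs is assumed to be nonzero.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(f, g):
    """Return (h, r) with f = h*g + r and deg r < deg g, as in (29.5); g != 0."""
    f, g = trim(f), trim(g)
    h = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = [Fraction(c) for c in f]
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coef = r[-1] / Fraction(g[-1])
        h[shift] = coef
        for i, c in enumerate(g):
            r[i + shift] -= coef * c
        r = trim(r)
    return trim(h), r

def extended_gcd(f, g):
    """Return (d, p, q) with d monic and d = p*f + q*g, as in (29.15)."""
    r0, r1 = trim([Fraction(c) for c in f]), trim([Fraction(c) for c in g])
    p0, p1 = [Fraction(1)], []
    q0, q1 = [], [Fraction(1)]
    while r1:
        h, r = divmod_poly(r0, r1)
        neg_h = [-c for c in h]
        r0, r1 = r1, r
        p0, p1 = p1, add(p0, mul(neg_h, p1))
        q0, q1 = q1, add(q0, mul(neg_h, q1))
    lead = r0[-1]
    return [c / lead for c in r0], [c / lead for c in p0], [c / lead for c in q0]

f = [-1, 0, 1]      # t^2 - 1 = (t - 1)(t + 1)
g = [2, -3, 1]      # t^2 - 3t + 2 = (t - 1)(t - 2)
d, p, q = extended_gcd(f, g)
```

For this pair the common factor is $t - 1$, and recombining `p` and `q` with `f` and `g` returns `d`.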


Another important concept associated with the algebra of polynomials is the concept of the 
least common multiple. 


Theorem 29.3. If f and g are polynomials, then there exists a least common multiple m 
which is a multiple of f and g and is a divisor of every common multiple of f and g. 


The proof of this theorem is based on the same argument as the proof of the preceding 
theorem, so it is left as an exercise. 


Exercises 


29.1 If the eigenvalues of an endomorphism A are all single roots of the characteristic equation, 
show that the characteristic polynomial of A is also a minimal polynomial of A. 


29.2 Prove Theorem 29.3. 


29.3 If f and g are nonzero polynomials and if d is their greatest common divisor, show that 
then 
m= fg/d 


is their least common multiple and, conversely, if m is their least common multiplies, then 


d= fg/m 
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is their greatest common divisor. 
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Section 30. Spectral Decomposition for Arbitrary Endomorphisms 


As we have mentioned in Section 27, not all endomorphisms have matrices which are
diagonal. However, for Hermitian endomorphisms, a decomposition of the endomorphism into a
linear combination of projections is possible and is given by (27.12). In this section we shall
consider the problem in general, and we shall find decompositions for arbitrary endomorphisms
which are, in some sense, closest to the simple decomposition (27.12).


We shall prove some preliminary theorems first. In Section 24 we remarked that the null
space $K(\mathbf{A})$ is always $\mathbf{A}$-invariant. This result can be generalized to the following


Theorem 30.1. If $f$ is any polynomial, then the null space $K(f(\mathbf{A}))$ is $\mathbf{A}$-invariant.


Proof. Since the multiplication of polynomials is a commutative operation, we have

$\mathbf{A}f(\mathbf{A}) = f(\mathbf{A})\mathbf{A}$    (30.1)

Hence if $\mathbf{v} \in K(f(\mathbf{A}))$, then

$f(\mathbf{A})\mathbf{A}\mathbf{v} = \mathbf{A}f(\mathbf{A})\mathbf{v} = \mathbf{A}\mathbf{0} = \mathbf{0}$    (30.2)


which shows that $\mathbf{A}\mathbf{v} \in K(f(\mathbf{A}))$. Therefore $K(f(\mathbf{A}))$ is $\mathbf{A}$-invariant.


Next we prove some theorems which describe the dependence of $K(f(\mathbf{A}))$ on the choice of $f$.


Theorem 30.2. If $f$ is a multiple of $g$, say

$f = hg$    (30.3)

then


$K(f(\mathbf{A})) \supset K(g(\mathbf{A}))$    (30.4)

Proof. This result is a general property of the null space. Indeed, for any endomorphisms $\mathbf{B}$ and
$\mathbf{C}$ we always have


$K(\mathbf{B}\mathbf{C}) \supset K(\mathbf{C})$    (30.5)


So if we set $h(\mathbf{A}) = \mathbf{B}$ and $g(\mathbf{A}) = \mathbf{C}$, then (30.5) reduces to (30.4).


The preceding theorem does not imply that $K(g(\mathbf{A}))$ is necessarily a proper subspace of
$K(f(\mathbf{A}))$, however. It is quite possible that the two subspaces, in fact, coincide. For example, if
$g$ and hence $f$ both belong to $\mathscr{P}(\mathbf{A})$, then $g(\mathbf{A}) = f(\mathbf{A}) = \mathbf{0}$, and thus


$K(f(\mathbf{A})) = K(g(\mathbf{A})) = K(\mathbf{0}) = \mathscr{V}$    (30.6)


However, if $m$ is the minimal polynomial of $\mathbf{A}$, and if $f$ is a proper divisor of $m$ (i.e., $m$ is not a
divisor of $f$) so that $f \notin \mathscr{P}(\mathbf{A})$, then $K(f(\mathbf{A}))$ is strictly a proper subspace of $\mathscr{V} = K(m(\mathbf{A}))$.
We can strengthen this result to the following


Theorem 30.3. If f is a divisor (proper or improper) of the minimal polynomial m of A, and if 
g is a proper divisor of f , then K(g(A)) is strictly a proper subspace of K(f (A)). 


Proof. By assumption there exists a polynomial h such that 


m = hf (30.7) 


We set 


k =hg (30.8) 


Then $k$ is a proper divisor of $m$, since by assumption, $g$ is a proper divisor of $f$. By the remark
preceding the theorem, $K(k(\mathbf{A}))$ is strictly a proper subspace of $\mathscr{V}$, which is equal to $K(m(\mathbf{A}))$. Thus
there exists a vector $\mathbf{v}$ such that


$k(\mathbf{A})\mathbf{v} = g(\mathbf{A})h(\mathbf{A})\mathbf{v} \neq \mathbf{0}$    (30.9)

which implies that the vector $\mathbf{u} = h(\mathbf{A})\mathbf{v}$ does not belong to $K(g(\mathbf{A}))$. On the other hand, from
(30.7)


$f(\mathbf{A})\mathbf{u} = f(\mathbf{A})h(\mathbf{A})\mathbf{v} = m(\mathbf{A})\mathbf{v} = \mathbf{0}$    (30.10)


which implies that $\mathbf{u}$ belongs to $K(f(\mathbf{A}))$. Thus $K(g(\mathbf{A}))$ is strictly a proper subspace of
$K(f(\mathbf{A}))$.


The next theorem shows the role of the greatest common divisor in terms of the null space. 


Theorem 30.4. Let f and g be any polynomials, and suppose that d is their greatest common 
divisor. Then 


$K(d(\mathbf{A})) = K(f(\mathbf{A})) \cap K(g(\mathbf{A}))$    (30.11)


Obviously, this result can be generalized for more than two polynomials.


Proof. Since $d$ is a common divisor of $f$ and $g$, the inclusion


$K(d(\mathbf{A})) \subset K(f(\mathbf{A})) \cap K(g(\mathbf{A}))$    (30.12)


follows readily from Theorem 30.2. To prove the reversed inclusion,


$K(d(\mathbf{A})) \supset K(f(\mathbf{A})) \cap K(g(\mathbf{A}))$    (30.13)


recall that from (29.15) there exist polynomials $p$ and $q$ such that


$d = pf + qg$    (30.14)


and thus


$d(\mathbf{A}) = p(\mathbf{A})f(\mathbf{A}) + q(\mathbf{A})g(\mathbf{A})$    (30.15)


This equation means that if $\mathbf{v} \in K(f(\mathbf{A})) \cap K(g(\mathbf{A}))$, then

$d(\mathbf{A})\mathbf{v} = p(\mathbf{A})f(\mathbf{A})\mathbf{v} + q(\mathbf{A})g(\mathbf{A})\mathbf{v} = p(\mathbf{A})\mathbf{0} + q(\mathbf{A})\mathbf{0} = \mathbf{0}$    (30.16)


so that $\mathbf{v} \in K(d(\mathbf{A}))$. Therefore (30.13) is valid and hence (30.11).


A corollary of the preceding theorem is the fact that if $f$ and $g$ are relatively prime then


$K(f(\mathbf{A})) \cap K(g(\mathbf{A})) = \{\mathbf{0}\}$    (30.17)

since in this case the greatest common divisor of $f$ and $g$ is $d(t) = 1$, so that

$K(d(\mathbf{A})) = K(\mathbf{A}^0) = K(\mathbf{I}) = \{\mathbf{0}\}$    (30.18)

Here we have assumed $\mathbf{A} \neq \mathbf{0}$, of course.

Next we consider the role of the least common multiple in terms of the null space.


Theorem 30.5. Let $f$ and $g$ be any polynomials, and suppose that $l$ is their least common
multiple. Then


$K(l(\mathbf{A})) = K(f(\mathbf{A})) + K(g(\mathbf{A}))$    (30.19)


where the operation on the right-hand side of (30.19) is the sum of subspaces defined in Section 10.
Like the result (30.11), the result (30.19) can be generalized in an obvious way for more than two
polynomials.


The proof of this theorem is based on the same argument as the proof of the preceding
theorem, so it is left as an exercise. As in the preceding theorem, a corollary of this theorem is that
if $f$ and $g$ are relatively prime (pairwise prime if there are more than two polynomials) then
(30.19) can be strengthened to


$K(l(\mathbf{A})) = K(f(\mathbf{A})) \oplus K(g(\mathbf{A}))$    (30.20)

Again, we have assumed that $\mathbf{A} \neq \mathbf{0}$. We leave the proof of (30.20) also as an exercise.


Having summarized the preliminary theorems, we are now ready to state the main theorem 
of this section. 


Theorem 30.6. If $m$ is the minimal polynomial of $\mathbf{A}$ which is factored into the form


$m(t) = (t - \lambda_1)^{a_1}\cdots(t - \lambda_L)^{a_L} = m_1(t)\cdots m_L(t)$    (30.21)


where $\lambda_1,\ldots,\lambda_L$ are distinct and $a_1,\ldots,a_L$ are positive integers, then $\mathscr{V}$ has the representation


$\mathscr{V} = K(m_1(\mathbf{A})) \oplus \cdots \oplus K(m_L(\mathbf{A}))$    (30.22)


Proof. Since $\lambda_1,\ldots,\lambda_L$ are distinct, the polynomials $m_1,\ldots,m_L$ are pairwise prime and their least
common multiple is $m$. Hence by (30.20) we have


$K(m(\mathbf{A})) = K(m_1(\mathbf{A})) \oplus \cdots \oplus K(m_L(\mathbf{A}))$    (30.23)


But since $m(\mathbf{A}) = \mathbf{0}$, $K(m(\mathbf{A})) = \mathscr{V}$, so that (30.22) holds.


Now from Theorem 30.1 we know that each subspace $K(m_i(\mathbf{A}))$, $i = 1,\ldots,L$, is $\mathbf{A}$-invariant;
then from (30.22) we see that $\mathbf{A}$ is reduced by the subspaces $K(m_1(\mathbf{A})),\ldots,K(m_L(\mathbf{A}))$.
Therefore, the results of Theorem 24.1 can be applied. In particular, if we choose a
basis for each $K(m_i(\mathbf{A}))$ and form a basis for $\mathscr{V}$ by the union of these bases, then the matrix of $\mathbf{A}$
takes the form (24.5) in Exercise 24.1. The next theorem shows that, in some sense, the
factorization (30.21) gives also the minimal polynomials of the restrictions of $\mathbf{A}$ to the various
$\mathbf{A}$-invariant subspaces from the representation (30.22).


Theorem 30.7. Each factor $m_k$ of $m$ is the minimal polynomial of the restriction of $\mathbf{A}$ to the
subspace $K(m_k(\mathbf{A}))$. More generally, any product of factors, say $m_1\cdots m_M$, is the minimal
polynomial of the restriction of $\mathbf{A}$ to the corresponding subspace $K(m_1(\mathbf{A})) \oplus \cdots \oplus K(m_M(\mathbf{A}))$.

Proof. We prove the special case of one factor only, say $m_1$; the proof of the general case of
several factors is similar and is left as an exercise. For definiteness, let $\bar{\mathbf{A}}$ denote the restriction of
$\mathbf{A}$ to $K(m_1(\mathbf{A}))$. Then $m_1(\bar{\mathbf{A}})$ is equal to the restriction of $m_1(\mathbf{A})$ to $K(m_1(\mathbf{A}))$, and, thus,
$m_1(\bar{\mathbf{A}}) = \mathbf{0}$, which means that $m_1 \in \mathscr{P}(\bar{\mathbf{A}})$. Now if $g$ is any proper divisor of $m_1$, then $g(\bar{\mathbf{A}}) \neq \mathbf{0}$;
for otherwise, we would have $g(\mathbf{A})m_2(\mathbf{A})\cdots m_L(\mathbf{A}) = \mathbf{0}$, contradicting the fact that $m$ is minimal for
$\mathbf{A}$. Therefore $m_1$ is minimal for $\bar{\mathbf{A}}$ and the proof is complete.


The form (24.5), generally, is not diagonal. However, if the minimal polynomial (30.21)
has simple roots only, i.e., the powers $a_1,\ldots,a_L$ are all equal to 1, then


$m_i(\mathbf{A}) = \mathbf{A} - \lambda_i\mathbf{I}$    (30.24)


for all $i$. In this case, the restriction of $\mathbf{A}$ to $K(m_i(\mathbf{A}))$ coincides with $\lambda_i\mathbf{I}$ on that subspace,
namely


$\mathbf{A}_{\mathscr{V}(\lambda_i)} = \lambda_i\mathbf{I}_{\mathscr{V}(\lambda_i)}$    (30.25)

Here we have used the fact, from (30.24), that

$\mathscr{V}(\lambda_i) = K(m_i(\mathbf{A}))$    (30.26)


Then the form (24.5) reduces to the diagonal form (27.3)$_1$ or (27.3)$_2$.


The condition that m has simple roots only turns out to be necessary for the existence of a 
diagonal matrix for A also, as we shall now see in the following theorem. 


Theorem 30.8. An endomorphism A has a diagonal matrix if and only if its minimal polynomial 
can be factored into distinct factors all of the first degree. 


Proof. Sufficiency has already been proven. To prove necessity, assume that the matrix of $\mathbf{A}$
relative to some basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ has the form (27.3)$_1$. Then the polynomial


$m(t) = (t - \lambda_1)\cdots(t - \lambda_L)$    (30.27)

is the minimal polynomial of $\mathbf{A}$. Indeed, each basis vector $\mathbf{e}_i$ is contained in the null space
$K(\mathbf{A} - \lambda_k\mathbf{I})$ for one particular $k$. Consequently $\mathbf{e}_i \in K(m(\mathbf{A}))$ for all $i = 1,\ldots,N$, and thus


$K(m(\mathbf{A})) = \mathscr{V}$    (30.28)


or equivalently

$m(\mathbf{A}) = \mathbf{0}$    (30.29)


which implies that $m \in \mathscr{P}(\mathbf{A})$. But since $\lambda_1,\ldots,\lambda_L$ are distinct, no proper divisor of $m$ still belongs
to $\mathscr{P}(\mathbf{A})$. Therefore $m$ is the minimal polynomial of $\mathbf{A}$.
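
Theorem 30.8 suggests a direct test: multiply the first-degree factors $\mathbf{A} - \lambda_k\mathbf{I}$ over the distinct eigenvalues and see whether the product vanishes. The sketch below applies this to two assumed 3x3 matrices that share the characteristic polynomial $(2-t)^2(5-t)$; the helper name `product_of_factors` is hypothetical.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def product_of_factors(M, roots):
    """Return (M - r1*I)(M - r2*I)... for the listed roots (repeats allowed)."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for r in roots:
        factor = [[M[i][j] - (r if i == j else 0) for j in range(n)]
                  for i in range(n)]
        result = matmul(result, factor)
    return result

def is_zero(M):
    return all(x == 0 for row in M for x in row)

D = [[2, 0, 0], [0, 2, 0], [0, 0, 5]]   # already diagonal
J = [[2, 1, 0], [0, 2, 0], [0, 0, 5]]   # same eigenvalues, not diagonalizable

D_simple = product_of_factors(D, [2, 5])       # (t - 2)(t - 5) applied to D
J_simple = product_of_factors(J, [2, 5])       # (t - 2)(t - 5) applied to J
J_double = product_of_factors(J, [2, 2, 5])    # (t - 2)^2 (t - 5) applied to J
```

For `D` the product of distinct first-degree factors vanishes, so its minimal polynomial has simple roots. For `J` it does not vanish, and the root 2 must be taken twice before the product is zero.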


As before, when the condition of the preceding theorem is satisfied, then we can define
projections $\mathbf{P}_1,\ldots,\mathbf{P}_L$ by


$R(\mathbf{P}_i) = \mathscr{V}(\lambda_i), \quad K(\mathbf{P}_i) = \bigoplus_{j=1,\,j\neq i}^{L}\mathscr{V}(\lambda_j)$    (30.30)


for all $i = 1,\ldots,L$, and the diagonal form (27.3)$_1$ shows that $\mathbf{A}$ has the representation


$\mathbf{A} = \lambda_1\mathbf{P}_1 + \cdots + \lambda_L\mathbf{P}_L = \sum_{i=1}^{L}\lambda_i\mathbf{P}_i$    (30.31)


It should be noted, however, that in stating this result we have not made use of any inner product,
so it is not meaningful to say whether or not the projections $\mathbf{P}_1,\ldots,\mathbf{P}_L$ are perpendicular; further, the
eigenvalues $\lambda_1,\ldots,\lambda_L$ are generally complex numbers. In fact, the factorization (30.21) for $m$ in
general is possible only if the scalar field is algebraically closed, such as the complex field used
here. If the scalar field is the real field, we should define the factors $m_1,\ldots,m_L$ of $m$ to be powers
of irreducible polynomials, i.e., polynomials having no proper divisors. Then the decomposition
(30.22) for $\mathscr{V}$ remains valid, since the argument of the proof is based entirely on the fact that the
factors of $m$ are pairwise prime.


Theorem 30.8 shows that in order to know whether or not $\mathbf{A}$ has a diagonal form, we must
know the roots and their multiplicities in the minimal polynomial $m$ of $\mathbf{A}$. Now since the
characteristic polynomial $f$ of $\mathbf{A}$ belongs to $\mathscr{P}(\mathbf{A})$, $m$ is a divisor of $f$. Hence the roots of $m$
are always roots of $f$. The next theorem gives the converse of this result.


Theorem 30.9. Each eigenvalue of $\mathbf{A}$ is a root of the minimal polynomial $m$ of $\mathbf{A}$ and vice
versa.


Proof. Sufficiency has already been proved. To prove necessity, let $\lambda$ be an eigenvalue of $\mathbf{A}$.
Then we wish to show that the polynomial


$g(t) = t - \lambda$    (30.32)


is a divisor of $m$. Since $g$ is of the first degree, if $g$ is not a divisor of $m$, then $m$ and $g$ are
relatively prime. By (30.17) and the fact that $m \in \mathscr{P}(\mathbf{A})$, we have


$\{\mathbf{0}\} = K(m(\mathbf{A})) \cap K(g(\mathbf{A})) = \mathscr{V} \cap K(g(\mathbf{A})) = K(g(\mathbf{A}))$    (30.33)


But this is impossible, since $K(g(\mathbf{A}))$, being the characteristic subspace corresponding to the
eigenvalue $\lambda$, cannot be of zero dimension.


The preceding theorem justifies our using the same notations $\lambda_1,\ldots,\lambda_L$ for the (distinct)
roots of the minimal polynomial $m$, as shown in (30.21). However, it should be noted that the root
$\lambda_k$ generally has a smaller multiplicity in $m$ than in $f$, because $m$ is a divisor of $f$. The
characteristic polynomial $f$ yields not only the (distinct) roots $\lambda_1,\ldots,\lambda_L$ of $m$, it determines also
the dimensions of their corresponding subspaces $K(m_1(\mathbf{A})),\ldots,K(m_L(\mathbf{A}))$ in the decomposition
(30.22). This result is made explicit in the following theorem.


Theorem 30.10. Let $d_k$ denote the algebraic multiplicity of the eigenvalue $\lambda_k$ as before; i.e., $d_k$ is
the multiplicity of $\lambda_k$ in $f$ [cf. (26.8)]. Then we have


$\dim K(m_k(\mathbf{A})) = d_k$    (30.34)


Proof. We prove this result by induction. Clearly, it is valid for all $\mathbf{A}$ having a diagonal form,
since in this case $m_k(\mathbf{A}) = \mathbf{A} - \lambda_k\mathbf{I}$, so that $K(m_k(\mathbf{A}))$ is the characteristic subspace corresponding
to $\lambda_k$ and its dimension is the geometric multiplicity as well as the algebraic multiplicity of $\lambda_k$.
Now assuming that the result is valid for all $\mathbf{A}$ whose minimal polynomial has at most $M$ multiple
roots, where $M = 0$ is the starting induction hypothesis, we wish to show that the same holds for
all $\mathbf{A}$ whose minimal polynomial has $M + 1$ multiple roots. To see this, we make use of the
decomposition (30.23) and, for definiteness, we assume that $\lambda_1$ is a multiple root of $m$. We put


$\mathscr{U} = K(m_2(\mathbf{A})) \oplus \cdots \oplus K(m_L(\mathbf{A}))$    (30.35)


Then $\mathscr{U}$ is $\mathbf{A}$-invariant. From Theorem 30.7 we know that the minimal polynomial $m_{\mathscr{U}}$ of the
restriction $\mathbf{A}_{\mathscr{U}}$ is


$m_{\mathscr{U}} = m_2\cdots m_L$    (30.36)


which has at most $M$ multiple roots. Hence by the induction hypothesis we have


$\dim\mathscr{U} = \dim K(m_2(\mathbf{A})) + \cdots + \dim K(m_L(\mathbf{A})) = d_2 + \cdots + d_L$    (30.37)


But from (26.8) and (30.23) we have also


$N = d_1 + d_2 + \cdots + d_L$
$N = \dim K(m_1(\mathbf{A})) + \dim K(m_2(\mathbf{A})) + \cdots + \dim K(m_L(\mathbf{A}))$    (30.38)

Comparing (30.38) with (30.37), we see that

$d_1 = \dim K(m_1(\mathbf{A}))$    (30.39)


Thus the result (30.34) is valid for all $\mathbf{A}$ whose minimal polynomial has $M + 1$ multiple roots, and
hence in general.


An immediate consequence of the preceding theorem is the following. 


Theorem 30.11. Let $b_k$ be the geometric multiplicity of $\lambda_k$, namely


$b_k = \dim\mathscr{V}(\lambda_k) = \dim K(g_k(\mathbf{A})) = \dim K(\mathbf{A} - \lambda_k\mathbf{I})$    (30.40)


Then we have


$1 \leq b_k \leq d_k - a_k + 1$    (30.41)


where $d_k$ is the algebraic multiplicity of $\lambda_k$ and $a_k$ is the multiplicity of $\lambda_k$ in $m$, as shown in
(30.21). Further, $b_k = d_k$ if and only if


$K(g_k(\mathbf{A})) = K(g_k^2(\mathbf{A}))$    (30.42)


Proof. If $\lambda_k$ is a simple root of $m$, i.e., $a_k = 1$ and $g_k = m_k$, then from (30.34) and (30.40) we
have $b_k = d_k$. On the other hand, if $\lambda_k$ is a multiple root of $m$, i.e., $a_k > 1$ and $m_k = g_k^{a_k}$, then the
polynomials $g_k, g_k^2, \ldots, g_k^{(a_k - 1)}$ are proper divisors of $m_k$. Hence by Theorem 30.3


$\mathscr{V}(\lambda_k) = K(g_k(\mathbf{A})) \subset K(g_k^2(\mathbf{A})) \subset \cdots \subset K(g_k^{(a_k - 1)}(\mathbf{A})) \subset K(m_k(\mathbf{A}))$    (30.43)


where the inclusions are strictly proper and the dimensions of the subspaces change by at least one
in each inclusion. Thus (30.41) holds.


The second part of the theorem can be proved as follows: If $\lambda_k$ is a simple root of $m$, then
$m_k = g_k$, and, thus, $K(g_k(\mathbf{A})) = K(g_k^2(\mathbf{A})) = \mathscr{V}(\lambda_k)$. On the other hand, if $\lambda_k$ is not a simple root
of $m$, then $m_k$ is at least of second degree. In this case $g_k$ and $g_k^2$ are both divisors of $m$. But
since $g_k$ is also a proper divisor of $g_k^2$, by Theorem 30.3, $K(g_k(\mathbf{A}))$ is strictly a proper subspace
of $K(g_k^2(\mathbf{A}))$, so that (30.42) cannot hold, and the proof is complete.


The preceding three theorems show that for each eigenvalue $\lambda_k$ of $\mathbf{A}$, generally there are
two nonzero $\mathbf{A}$-invariant subspaces, namely, the eigenspace $\mathscr{V}(\lambda_k)$ and the subspace $K(m_k(\mathbf{A}))$.
For definiteness, let us call the latter subspace the characteristic subspace corresponding to $\lambda_k$ and
denote it by the more compact notation $\mathscr{U}(\lambda_k)$. Then $\mathscr{V}(\lambda_k)$ is a subspace of $\mathscr{U}(\lambda_k)$ in general,
and the two subspaces coincide if and only if $\lambda_k$ is a simple root of $m$. Since $\lambda_k$ is the only
eigenvalue of the restriction of $\mathbf{A}$ to $\mathscr{U}(\lambda_k)$, by the Cayley-Hamilton theorem we have also


$\mathscr{U}(\lambda_k) = K((\mathbf{A} - \lambda_k\mathbf{I})^{d_k})$    (30.44)

where $d_k$ is the algebraic multiplicity of $\lambda_k$, which is also the dimension of $\mathscr{U}(\lambda_k)$. Thus we can
determine the characteristic subspace directly from the characteristic polynomial of $\mathbf{A}$ by (30.44).


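Now if we define $\mathbf{P}_k$ to be the projection on $\mathscr{U}(\lambda_k)$ in the direction of the remaining
$\mathscr{U}(\lambda_j)$, $j \neq k$, namely


$R(\mathbf{P}_k) = \mathscr{U}(\lambda_k), \quad K(\mathbf{P}_k) = \bigoplus_{j=1,\,j\neq k}^{L}\mathscr{U}(\lambda_j)$    (30.45)


and we define $\mathbf{B}_k$ to be $\mathbf{A} - \lambda_k\mathbf{P}_k$ on $\mathscr{U}(\lambda_k)$ and $\mathbf{0}$ on $\mathscr{U}(\lambda_j)$, $j \neq k$, then $\mathbf{A}$ has the spectral
decomposition by a direct sum

$\mathbf{A} = \sum_{j=1}^{L}(\lambda_j\mathbf{P}_j + \mathbf{B}_j)$    (30.46)

where

$\mathbf{P}_j^2 = \mathbf{P}_j$
$\mathbf{P}_j\mathbf{B}_j = \mathbf{B}_j\mathbf{P}_j = \mathbf{B}_j$, $\quad j = 1,\ldots,L$    (30.47)
$\mathbf{B}_j^{a_j} = \mathbf{0}, \quad 1 \leq a_j \leq d_j$


$\mathbf{P}_j\mathbf{P}_k = \mathbf{0}$
$\mathbf{P}_j\mathbf{B}_k = \mathbf{B}_k\mathbf{P}_j = \mathbf{0}$, $\quad j \neq k, \quad j,k = 1,\ldots,L$    (30.48)
$\mathbf{B}_j\mathbf{B}_k = \mathbf{0}$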

In general, an endomorphism $\mathbf{B}$ satisfying the condition


$\mathbf{B}^a = \mathbf{0}$    (30.49)


for some power $a$ is called nilpotent. From (30.49) or from Theorem 30.9 the only eigenvalue of a
nilpotent endomorphism is 0, and the lowest power $a$ satisfying (30.49) is an integer $a$, $1 \leq a \leq N$,
such that $t^N$ is the characteristic polynomial of $\mathbf{B}$ and $t^a$ is the minimal polynomial of $\mathbf{B}$. In view
of (30.47) we see that each endomorphism $\mathbf{B}_j$ in the decomposition (30.46) is nilpotent and can be
regarded also as a nilpotent endomorphism on $\mathscr{U}(\lambda_j)$. In order to decompose $\mathbf{A}$ further from
(30.46) we must determine a spectral decomposition for each $\mathbf{B}_j$. This problem is solved in
general as follows.


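First, we recall that in Exercise 26.8 we have defined a nilcyclic endomorphism $\mathbf{C}$ to be a
nilpotent endomorphism such that


$\mathbf{C}^N = \mathbf{0}$ but $\mathbf{C}^{N-1} \neq \mathbf{0}$    (30.50)


where $N$ is the dimension of the underlying vector space $\mathscr{V}$. For such an endomorphism we can
find a cyclic basis $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ which satisfies the conditions


$\mathbf{C}\mathbf{e}_1 = \mathbf{0}, \quad \mathbf{C}\mathbf{e}_2 = \mathbf{e}_1, \ldots, \mathbf{C}\mathbf{e}_N = \mathbf{e}_{N-1}$    (30.51)

or, equivalently

$\mathbf{C}^{N-1}\mathbf{e}_N = \mathbf{e}_1, \quad \mathbf{C}^{N-2}\mathbf{e}_N = \mathbf{e}_2, \ldots, \mathbf{C}\mathbf{e}_N = \mathbf{e}_{N-1}$    (30.52)

so that the matrix of $\mathbf{C}$ takes the simple form (26.19). Indeed, we can choose $\mathbf{e}_N$ to be any vector
such that $\mathbf{C}^{N-1}\mathbf{e}_N \neq \mathbf{0}$; then the set $\{\mathbf{e}_1,\ldots,\mathbf{e}_N\}$ defined by (30.52) is linearly independent and thus
forms a cyclic basis for $\mathbf{C}$. Nilcyclic endomorphisms constitute only a special class of nilpotent
endomorphisms, but in some sense the former can be regarded as the building blocks for the latter.
The result is made precise by the following theorem.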


Theorem 30.12. Let $\mathbf{B}$ be a nonzero nilpotent endomorphism of $\mathcal{V}$ in general, say $\mathbf{B}$ satisfies the conditions

$$\mathbf{B}^a = \mathbf{0} \quad \text{but} \quad \mathbf{B}^{a-1} \neq \mathbf{0} \tag{30.53}$$

for some integer $a$ between 1 and $N$. Then there exists a direct sum decomposition for $\mathcal{V}$:

$$\mathcal{V} = \mathcal{V}_1 \oplus \cdots \oplus \mathcal{V}_M \tag{30.54}$$

and a corresponding direct sum decomposition for $\mathbf{B}$ (in the sense explained in Section 24):

$$\mathbf{B} = \mathbf{B}_1 + \cdots + \mathbf{B}_M \tag{30.55}$$

such that each $\mathbf{B}_j$ is nilpotent and its restriction to $\mathcal{V}_j$ is nilcyclic. The subspaces $\mathcal{V}_1, \ldots, \mathcal{V}_M$ in the decomposition (30.54) are not unique, but their dimensions are unique and obey the following rules: The maximum of the dimensions of $\mathcal{V}_1, \ldots, \mathcal{V}_M$ is equal to the integer $a$ in (30.53); the number $N_a$ of subspaces among $\mathcal{V}_1, \ldots, \mathcal{V}_M$ having dimension $a$ is given by

$$N_a = N - \dim K(\mathbf{B}^{a-1}) \tag{30.56}$$

More generally, the number $N_b$ of subspaces among $\mathcal{V}_1, \ldots, \mathcal{V}_M$ having dimensions greater than or equal to $b$ is given by

$$N_b = \dim K(\mathbf{B}^{b}) - \dim K(\mathbf{B}^{b-1}) \tag{30.57}$$

for all $b = 1, \ldots, a$. In particular, when $b = 1$, $N_1$ is equal to the integer $M$ in (30.54), and (30.57) reduces to

$$M = \dim K(\mathbf{B}) \tag{30.58}$$
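The counting rule of the theorem can be tried out numerically. The following is a minimal sketch, assuming the nilpotent endomorphism is given by a square matrix of integers or rationals; the helper names `matmul`, `rank` and `nilcyclic_block_counts` are hypothetical and chosen only for this illustration. It computes $\dim K(\mathbf{B}^k)$ as $N$ minus the rank of $\mathbf{B}^k$, forms the differences $N_b$ of (30.57), and converts them into the number of subspaces of each exact dimension.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(m):
    """Rank of a matrix by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def nilcyclic_block_counts(b):
    """Return {dimension: number of nilcyclic subspaces} for a nilpotent matrix b."""
    n = len(b)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # B^0 = I
    kernel_dims = [0]  # kernel_dims[k] = dim K(B^k)
    while kernel_dims[-1] < n:
        if len(kernel_dims) > n:
            raise ValueError("matrix is not nilpotent")
        power = matmul(power, b)
        kernel_dims.append(n - rank(power))
    a = len(kernel_dims) - 1  # lowest power with B^a = 0
    # at_least[k] = N_k of (30.57): subspaces of dimension >= k
    at_least = [0] + [kernel_dims[k] - kernel_dims[k - 1] for k in range(1, a + 1)] + [0]
    counts = {}
    for size in range(1, a + 1):
        exact = at_least[size] - at_least[size + 1]
        if exact:
            counts[size] = exact
    return counts


# One nilcyclic block of dimension 3 and one of dimension 1 (N = 4, a = 3).
B = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
print(nilcyclic_block_counts(B))  # {1: 1, 3: 1}
```

For this matrix the kernel dimensions of $\mathbf{B}$, $\mathbf{B}^2$, $\mathbf{B}^3$ are 2, 3, 4, so $M = \dim K(\mathbf{B}) = 2$ in agreement with (30.58). Exact rational arithmetic is used so that the rank is not affected by rounding.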


Proof. We prove the theorem by induction on the dimension of $\mathcal{V}$. Clearly, the theorem is valid for a one-dimensional space since a nilpotent endomorphism there is simply the zero endomorphism, which is nilcyclic. Assuming now the theorem is valid for vector spaces of dimension less than or equal to $N - 1$, we shall prove that the same is valid for vector spaces of dimension $N$.

Notice first that if the integer $a$ in (30.53) is equal to $N$, then $\mathbf{B}$ is nilcyclic and the assertion is trivially satisfied with $M = 1$, so we can assume that $1 < a < N$. By (30.53)$_2$, there exists a vector $\mathbf{e}_a \in \mathcal{V}$ such that $\mathbf{B}^{a-1}\mathbf{e}_a \neq \mathbf{0}$. As in (30.52) we define

$$\mathbf{B}^{a-1}\mathbf{e}_a = \mathbf{e}_1, \quad \ldots, \quad \mathbf{B}\mathbf{e}_a = \mathbf{e}_{a-1} \tag{30.59}$$

Then the set $\{\mathbf{e}_1, \ldots, \mathbf{e}_a\}$ is linearly independent. We put $\mathcal{V}_1$ to be the subspace generated by $\{\mathbf{e}_1, \ldots, \mathbf{e}_a\}$. Then by definition $\dim \mathcal{V}_1 = a$, and the restriction of $\mathbf{B}$ to $\mathcal{V}_1$ is nilcyclic.
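Now recall that for any subspace of a vector space we can define a factor space (cf. Section 11). As usual we denote the factor space of $\mathcal{V}$ over $\mathcal{V}_1$ by $\mathcal{V}/\mathcal{V}_1$. From the result of Exercise 11.5 and (30.59), we have

$$\dim \mathcal{V}/\mathcal{V}_1 = N - a \le N - 1 \tag{30.60}$$

Thus we can apply the theorem to the factor space $\mathcal{V}/\mathcal{V}_1$. For definiteness, let us use the notation of Section 11, namely, if $\mathbf{v} \in \mathcal{V}$, then $\bar{\mathbf{v}}$ denotes the equivalence set of $\mathbf{v}$ in $\mathcal{V}/\mathcal{V}_1$. This notation means that the superimposed bar is the canonical projection from $\mathcal{V}$ to $\mathcal{V}/\mathcal{V}_1$. From (30.59) it is easy to see that $\mathcal{V}_1$ is $\mathbf{B}$-invariant. Hence if $\mathbf{u}$ and $\mathbf{v}$ belong to the same equivalence set, so do $\mathbf{B}\mathbf{u}$ and $\mathbf{B}\mathbf{v}$. Therefore we can define an endomorphism $\bar{\mathbf{B}}$ on the factor space $\mathcal{V}/\mathcal{V}_1$ by

$$\bar{\mathbf{B}}\bar{\mathbf{v}} = \overline{\mathbf{B}\mathbf{v}} \tag{30.61}$$

for all $\mathbf{v} \in \mathcal{V}$ or equivalently for all $\bar{\mathbf{v}} \in \mathcal{V}/\mathcal{V}_1$. Applying (30.61) repeatedly, we have also

$$\bar{\mathbf{B}}^k\bar{\mathbf{v}} = \overline{\mathbf{B}^k\mathbf{v}} \tag{30.62}$$

for all integers $k$. In particular, $\bar{\mathbf{B}}$ is nilpotent and

$$\bar{\mathbf{B}}^a = \mathbf{0} \tag{30.63}$$

By the induction hypothesis we can then find a direct sum decomposition of the form

$$\mathcal{V}/\mathcal{V}_1 = \mathcal{U}_1 \oplus \cdots \oplus \mathcal{U}_p \tag{30.64}$$

for the factor space $\mathcal{V}/\mathcal{V}_1$ and a corresponding direct sum decomposition

$$\bar{\mathbf{B}} = \mathbf{F}_1 + \cdots + \mathbf{F}_p \tag{30.65}$$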
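for $\bar{\mathbf{B}}$. In particular, there are cyclic bases in the subspaces $\mathcal{U}_1, \ldots, \mathcal{U}_p$ for the nilcyclic endomorphisms which are the restrictions of $\mathbf{F}_1, \ldots, \mathbf{F}_p$ to the corresponding subspaces. For definiteness, let $\{\bar{\mathbf{f}}_1, \ldots, \bar{\mathbf{f}}_b\}$ be a cyclic basis in $\mathcal{U}_1$, say

$$\bar{\mathbf{B}}^{b}\bar{\mathbf{f}}_b = \mathbf{0}, \quad \bar{\mathbf{B}}^{b-1}\bar{\mathbf{f}}_b = \bar{\mathbf{f}}_1, \quad \ldots, \quad \bar{\mathbf{B}}\bar{\mathbf{f}}_b = \bar{\mathbf{f}}_{b-1} \tag{30.66}$$

From (30.63), $b$ is necessarily less than or equal to $a$.

From (30.66)$_1$ and (30.62) we see that $\mathbf{B}^b\mathbf{f}_b$ belongs to $\mathcal{V}_1$ and thus can be expressed as a linear combination of $\{\mathbf{e}_1, \ldots, \mathbf{e}_a\}$, say

$$\mathbf{B}^b\mathbf{f}_b = \alpha_1\mathbf{e}_1 + \cdots + \alpha_a\mathbf{e}_a = \left( \alpha_1\mathbf{B}^{a-1} + \cdots + \alpha_{a-1}\mathbf{B} + \alpha_a\mathbf{I} \right)\mathbf{e}_a \tag{30.67}$$

Now there are two possibilities: (i) $\mathbf{B}^b\mathbf{f}_b = \mathbf{0}$ or (ii) $\mathbf{B}^b\mathbf{f}_b \neq \mathbf{0}$. In case (i) we define as before

$$\mathbf{B}^{b-1}\mathbf{f}_b = \mathbf{f}_1, \quad \ldots, \quad \mathbf{B}\mathbf{f}_b = \mathbf{f}_{b-1} \tag{30.68}$$

Then $\{\mathbf{f}_1, \ldots, \mathbf{f}_b\}$ is a linearly independent set in $\mathcal{V}$, and we put $\mathcal{V}_2$ to be the subspace generated by $\{\mathbf{f}_1, \ldots, \mathbf{f}_b\}$. On the other hand, in case (ii) from (30.53) we see that $b$ is strictly less than $a$; moreover, from (30.67) we have

$$\mathbf{0} = \mathbf{B}^a\mathbf{f}_b = \left( \alpha_{a-b+1}\mathbf{B}^{a-1} + \cdots + \alpha_a\mathbf{B}^{a-b} \right)\mathbf{e}_a = \alpha_{a-b+1}\mathbf{e}_1 + \cdots + \alpha_a\mathbf{e}_b \tag{30.69}$$

which implies

$$\alpha_{a-b+1} = \cdots = \alpha_a = 0 \tag{30.70}$$

or equivalently

$$\mathbf{B}^b\mathbf{f}_b = \left( \alpha_1\mathbf{B}^{a-1} + \cdots + \alpha_{a-b}\mathbf{B}^{b} \right)\mathbf{e}_a \tag{30.71}$$

Hence we can choose another vector $\mathbf{f}_b'$ in the same equivalence set of $\bar{\mathbf{f}}_b$ by

$$\mathbf{f}_b' = \mathbf{f}_b - \left( \alpha_1\mathbf{B}^{a-b-1} + \cdots + \alpha_{a-b}\mathbf{I} \right)\mathbf{e}_a = \mathbf{f}_b - \alpha_1\mathbf{e}_{b+1} - \cdots - \alpha_{a-b}\mathbf{e}_a \tag{30.72}$$

which now obeys the condition $\mathbf{B}^b\mathbf{f}_b' = \mathbf{0}$, and we can proceed in exactly the same way as in case (i). Thus in any case every cyclic basis $\{\bar{\mathbf{f}}_1, \ldots, \bar{\mathbf{f}}_b\}$ for $\mathcal{U}_1$ gives rise to a cyclic set $\{\mathbf{f}_1, \ldots, \mathbf{f}_b\}$ in $\mathcal{V}$.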


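Applying this result to each one of the subspaces $\mathcal{U}_1, \ldots, \mathcal{U}_p$ we obtain cyclic sets $\{\mathbf{f}_1, \ldots, \mathbf{f}_b\}$, $\{\mathbf{g}_1, \ldots, \mathbf{g}_c\}, \ldots$, and subspaces $\mathcal{V}_2, \ldots, \mathcal{V}_{p+1}$ generated by them in $\mathcal{V}$. Now it is clear that the union of $\{\mathbf{e}_1, \ldots, \mathbf{e}_a\}$, $\{\mathbf{f}_1, \ldots, \mathbf{f}_b\}$, $\{\mathbf{g}_1, \ldots, \mathbf{g}_c\}, \ldots$, forms a basis of $\mathcal{V}$, since from (30.59), (30.60), and (30.64) there are precisely $N$ vectors in the union; further, if we have

$$\alpha_1\mathbf{e}_1 + \cdots + \alpha_a\mathbf{e}_a + \beta_1\mathbf{f}_1 + \cdots + \beta_b\mathbf{f}_b + \gamma_1\mathbf{g}_1 + \cdots + \gamma_c\mathbf{g}_c + \cdots = \mathbf{0} \tag{30.73}$$

then taking the canonical projection to $\mathcal{V}/\mathcal{V}_1$ yields

$$\beta_1\bar{\mathbf{f}}_1 + \cdots + \beta_b\bar{\mathbf{f}}_b + \gamma_1\bar{\mathbf{g}}_1 + \cdots + \gamma_c\bar{\mathbf{g}}_c + \cdots = \mathbf{0}$$

which implies

$$\beta_1 = \cdots = \beta_b = \gamma_1 = \cdots = \gamma_c = \cdots = 0$$

and substituting this result back into (30.73) yields

$$\alpha_1 = \cdots = \alpha_a = 0$$

Thus $\mathcal{V}$ has a direct sum decomposition given by (30.54) with $M = p + 1$ and $\mathbf{B}$ has a corresponding decomposition given by (30.55) where $\mathbf{B}_1, \ldots, \mathbf{B}_M$ have the prescribed properties.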


Now the only assertion yet to be proved is equation (30.57). This result follows from the general rule that for any nilcyclic endomorphism $\mathbf{C}$ on an $L$-dimensional space we have

$$\dim K(\mathbf{C}^k) - \dim K(\mathbf{C}^{k-1}) = \begin{cases} 1 & \text{for } 1 \le k \le L \\ 0 & \text{for } k > L \end{cases}$$

Applying this rule to the restriction of $\mathbf{B}_j$ to $\mathcal{V}_j$ for all $j = 1, \ldots, M$ and using the fact that the kernel of $\mathbf{B}^k$ is equal to the direct sum of the kernels of the restrictions of $(\mathbf{B}_j)^k$ for all $j = 1, \ldots, M$, we prove easily that (30.57) holds. Thus the proof is complete.


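In general, we cannot expect the subspaces $\mathcal{V}_1, \ldots, \mathcal{V}_M$ in the decomposition (30.54) to be unique. Indeed, if there are two subspaces among $\mathcal{V}_1, \ldots, \mathcal{V}_M$ having the same dimension, say $\dim \mathcal{V}_1 = \dim \mathcal{V}_2 = a$, then we can decompose the direct sum $\mathcal{V}_1 \oplus \mathcal{V}_2$ in many other ways, e.g.,

$$\mathcal{V}_1 \oplus \mathcal{V}_2 = \hat{\mathcal{V}}_1 \oplus \hat{\mathcal{V}}_2 \tag{30.74}$$

and when we substitute (30.74) into (30.54) the new decomposition

$$\mathcal{V} = \hat{\mathcal{V}}_1 \oplus \hat{\mathcal{V}}_2 \oplus \mathcal{V}_3 \oplus \cdots \oplus \mathcal{V}_M$$

possesses exactly the same properties as the original decomposition (30.54). For instance we can define $\hat{\mathcal{V}}_1$ and $\hat{\mathcal{V}}_2$ to be the subspaces generated by the linearly independent cyclic sets $\{\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_a\}$ and $\{\hat{\mathbf{f}}_1, \ldots, \hat{\mathbf{f}}_a\}$, where we choose the starting vectors $\hat{\mathbf{e}}_a$ and $\hat{\mathbf{f}}_a$ by

$$\hat{\mathbf{e}}_a = \alpha\mathbf{e}_a + \beta\mathbf{f}_a, \qquad \hat{\mathbf{f}}_a = \gamma\mathbf{e}_a + \delta\mathbf{f}_a$$

provided that the coefficient matrix on the right hand side is nonsingular.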


If we apply the preceding theorem to the restriction of $\mathbf{A} - \lambda_k\mathbf{I}$ to $\mathcal{U}_k = \mathcal{U}(\lambda_k)$, we see that the inequality (30.41) is the best possible one in general. Indeed, (30.41)$_2$ becomes an equality if and only if $\mathcal{U}_k$ has the decomposition

$$\mathcal{U}_k = \mathcal{U}_{k1} \oplus \cdots \oplus \mathcal{U}_{kM} \tag{30.75}$$

where the dimensions of the subspaces $\mathcal{U}_{k1}, \ldots, \mathcal{U}_{kM}$ are

$$\dim \mathcal{U}_{k1} = a_k, \qquad \dim \mathcal{U}_{k2} = \cdots = \dim \mathcal{U}_{kM} = 1$$

If there is more than one subspace among $\mathcal{U}_{k1}, \ldots, \mathcal{U}_{kM}$ having dimension greater than one, then (30.41)$_2$ is a strict inequality.


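The matrix of the restriction of $\mathbf{A}$ to $\mathcal{U}_k$ relative to the union of the cyclic bases for $\mathcal{U}_{k1}, \ldots, \mathcal{U}_{kM}$ has the form

$$A_k = \begin{bmatrix} A_{k1} & & 0 \\ & \ddots & \\ 0 & & A_{kM} \end{bmatrix} \tag{30.76}$$

where each block $A_{kj}$ on the diagonal of $A_k$ combines $\lambda_k$ times the identity with the matrix of the nilcyclic restriction of $\mathbf{A} - \lambda_k\mathbf{I}$ to $\mathcal{U}_{kj}$:

$$A_{kj} = \begin{bmatrix} \lambda_k & 1 & & 0 \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_k \end{bmatrix} \tag{30.77}$$

Substituting (30.76) into (24.5) yields the Jordan normal form for $\mathbf{A}$:

$$M(\mathbf{A}) = \begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_L \end{bmatrix} \tag{30.78}$$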


The Jordan normal form is an important result since it gives a geometric interpretation of an arbitrary endomorphism of a vector space. In general, we say that two endomorphisms $\mathbf{A}$ and $\mathbf{A}'$ are similar if the matrix of $\mathbf{A}$ relative to a basis is identical to the matrix of $\mathbf{A}'$ relative to another basis. From the transformation law (22.7), we see that $\mathbf{A}$ and $\mathbf{A}'$ are similar if and only if there exists a nonsingular endomorphism $\mathbf{T}$ such that

$$\mathbf{A}' = \mathbf{T}\mathbf{A}\mathbf{T}^{-1} \tag{30.79}$$

Clearly, (30.79) defines an equivalence relation on $\mathcal{L}(\mathcal{V};\mathcal{V})$. We call the equivalence sets relative to (30.79) the conjugate subsets of $\mathcal{L}(\mathcal{V};\mathcal{V})$. Now for each $\mathbf{A} \in \mathcal{L}(\mathcal{V};\mathcal{V})$ the Jordan normal form of $\mathbf{A}$ is a particular matrix of $\mathbf{A}$ and is unique to within an arbitrary change of ordering of the various square blocks on the diagonal of the matrix. Hence $\mathbf{A}$ and $\mathbf{A}'$ are similar if and only if they have the same Jordan normal form. Thus the Jordan normal form characterizes the conjugate subsets of $\mathcal{L}(\mathcal{V};\mathcal{V})$.


Exercises 


30.1 Prove Theorem 30.5. 
30.2 Prove the general case of Theorem 30.7. 


30.3 Let $\mathbf{U}$ be a unitary endomorphism of an inner product space $\mathcal{V}$. Show that $\mathbf{U}$ has a
diagonal form.


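30.4 If $\mathcal{V}$ is a real inner product space, show that an orthogonal endomorphism $\mathbf{Q}$ in general
does not have a diagonal form, but it has the spectral form

$$M(\mathbf{Q}) = \begin{bmatrix}
1 & & & & & & & \\
 & \ddots & & & & & & \\
 & & 1 & & & & & \\
 & & & -1 & & & & \\
 & & & & \ddots & & & \\
 & & & & & -1 & & \\
 & & & & & & \begin{matrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{matrix} & \\
 & & & & & & & \ddots
\end{bmatrix}$$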


where the angles $\theta_1, \theta_2, \ldots$ may or may not be distinct.


30.5 Determine whether the endomorphism A whose matrix relative to a certain basis is


1 1 0 
M(A)=|0 1 0 
6:41 a 


can have a diagonal matrix relative to another basis. Does the result depend on whether the 
scalar field is real or complex? 


30.6 Determine the Jordan normal form for the endomorphism A whose matrix relative to a 
certain basis is 


M(A)= 


O oOo e 
or re 
PRR 


Chapter 7 


TENSOR ALGEBRA 


The concept of a tensor is of major importance in applied mathematics. Virtually every 
discipline in the physical sciences makes some use of tensors. Admittedly, one does not always 
need to spend a lot of time and effort to gain a computational facility with tensors as they are 
used in the applications. However, we take the position that a better understanding of tensors is 
obtained if we follow a systematic approach, using the language of finite-dimensional vector 
spaces. We begin with a brief discussion of linear functions on a vector space. Since in the 
applications the scalar field is usually the real field, from now on we shall consider real vector 
spaces only. 


Section 31. Linear Functions, the Dual Space 


Let $\mathcal{V}$ be a real vector space of dimension $N$. We consider the space of linear functions $\mathcal{L}(\mathcal{V};\mathcal{R})$ from $\mathcal{V}$ into the real numbers $\mathcal{R}$. By Theorem 16.1, $\dim \mathcal{L}(\mathcal{V};\mathcal{R}) = \dim \mathcal{V} = N$. Thus $\mathcal{V}$ and $\mathcal{L}(\mathcal{V};\mathcal{R})$ are isomorphic. We call $\mathcal{L}(\mathcal{V};\mathcal{R})$ the dual space of $\mathcal{V}$, and we denote it by the special notation $\mathcal{V}^*$. To distinguish elements of $\mathcal{V}$ from those of $\mathcal{V}^*$, we shall call the former elements vectors and the latter elements covectors. However, these two names are strictly relative to each other. Since $\mathcal{V}^*$ is an $N$-dimensional vector space by itself, we can apply any result valid for a vector space in general to $\mathcal{V}^*$ as well as to $\mathcal{V}$. In fact, we can even define a dual space $(\mathcal{V}^*)^*$ for $\mathcal{V}^*$ just as we define a dual space $\mathcal{V}^*$ for $\mathcal{V}$. In order not to introduce too many new concepts at the same time, we shall postpone the second dual space $(\mathcal{V}^*)^*$ until the next section. Hence in this section $\mathcal{V}$ shall be a given $N$-dimensional space and $\mathcal{V}^*$ shall denote its dual space. As usual, we denote typical elements of $\mathcal{V}$ by $\mathbf{u}, \mathbf{v}, \mathbf{w}, \ldots$. Then the typical elements of $\mathcal{V}^*$ are denoted by $\mathbf{u}^*, \mathbf{v}^*, \mathbf{w}^*, \ldots$. However, it should be noted that the asterisk here is strictly a convenient notation, not a symbol for a function from $\mathcal{V}$ to $\mathcal{V}^*$. Thus $\mathbf{u}^*$ is not related in any particular way to $\mathbf{u}$. Also, for some covectors, such as those that constitute a dual basis to be defined shortly, this notation becomes rather cumbersome. In such cases, the notation is simply abandoned. For instance, without fear of ambiguity we denote the null covector in $\mathcal{V}^*$ by the same notation as the null vector in $\mathcal{V}$, namely $\mathbf{0}$, instead of $\mathbf{0}^*$.


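If $\mathbf{v}^* \in \mathcal{V}^*$, then $\mathbf{v}^*$ is a linear function from $\mathcal{V}$ to $\mathcal{R}$, i.e.,

$$\mathbf{v}^* : \mathcal{V} \to \mathcal{R}$$

such that for any vectors $\mathbf{u}, \mathbf{v} \in \mathcal{V}$ and scalars $\alpha, \beta \in \mathcal{R}$

$$\mathbf{v}^*(\alpha\mathbf{u} + \beta\mathbf{v}) = \alpha\mathbf{v}^*(\mathbf{u}) + \beta\mathbf{v}^*(\mathbf{v})$$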


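Of course, the linear operations on the right hand side are those of $\mathcal{R}$ while those on the left hand side are linear operations in $\mathcal{V}$. For a reason that will become apparent later, it is more convenient to denote the value of $\mathbf{v}^*$ at $\mathbf{v}$ by the notation $\langle \mathbf{v}^*, \mathbf{v} \rangle$. Then the bracket $\langle\ ,\ \rangle$ operation can be viewed as a function

$$\langle\ ,\ \rangle : \mathcal{V}^* \times \mathcal{V} \to \mathcal{R}$$

It is easy to verify that this operation has the following properties:

(i) $\langle \alpha\mathbf{v}^* + \beta\mathbf{u}^*, \mathbf{v} \rangle = \alpha\langle \mathbf{v}^*, \mathbf{v} \rangle + \beta\langle \mathbf{u}^*, \mathbf{v} \rangle$
(ii) $\langle \mathbf{v}^*, \alpha\mathbf{u} + \beta\mathbf{v} \rangle = \alpha\langle \mathbf{v}^*, \mathbf{u} \rangle + \beta\langle \mathbf{v}^*, \mathbf{v} \rangle$ (31.1)
(iii) For any given $\mathbf{v}^*$, $\langle \mathbf{v}^*, \mathbf{v} \rangle$ vanishes for all $\mathbf{v} \in \mathcal{V}$ if and only if $\mathbf{v}^* = \mathbf{0}$.
(iv) Similarly, for any given $\mathbf{v}$, $\langle \mathbf{v}^*, \mathbf{v} \rangle$ vanishes for all $\mathbf{v}^* \in \mathcal{V}^*$ if and only if $\mathbf{v} = \mathbf{0}$.

The first two properties define $\langle\ ,\ \rangle$ to be a bilinear operation on $\mathcal{V}^* \times \mathcal{V}$, and the last two properties define $\langle\ ,\ \rangle$ to be a definite operation. These properties resemble the properties of an inner product, so that we call the operation $\langle\ ,\ \rangle$ the scalar product. As we shall see, we can define many concepts associated with the scalar product similar to corresponding concepts associated with an inner product. The first example is the concept of the dual basis, which is the counterpart of the concept of the reciprocal basis.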


If $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ is a basis for $\mathcal{V}$, we define the dual basis to be a basis $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$ for $\mathcal{V}^*$ such that

$$\langle \mathbf{e}^i, \mathbf{e}_j \rangle = \delta^i_j \tag{31.2}$$

for all $i, j = 1, \ldots, N$. The reader should compare this condition with the condition (14.1) that defines the reciprocal basis. By exactly the same argument as before we can prove the following theorem.


Theorem 31.1. The dual basis relative to a given basis exists and is unique. 
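In components the dual basis can be computed explicitly. The sketch below assumes that the basis vectors are given by their components relative to some fixed reference basis of $\mathcal{R}^N$ and that covectors are given by their components relative to the corresponding reference dual basis; under that assumption the covectors $\mathbf{e}^i$ are the rows of the inverse of the matrix whose columns are the $\mathbf{e}_j$. The function names `invert`, `dual_basis` and `pair` are hypothetical.

```python
from fractions import Fraction


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination over exact rationals."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [x - factor * y for x, y in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def dual_basis(basis):
    """basis[j] holds the components of e_j; result[i] holds those of e^i."""
    n = len(basis)
    # Matrix E whose j-th column is e_j.
    columns = [[basis[j][i] for j in range(n)] for i in range(n)]
    # Row i of E^-1 paired with column j of E gives delta^i_j, i.e. (31.2).
    return invert(columns)


def pair(covector, vector):
    """Scalar product <v*, v> computed from components."""
    return sum(c * x for c, x in zip(covector, vector))


basis = [[1, 0], [1, 1]]   # e_1 = (1, 0), e_2 = (1, 1)
dual = dual_basis(basis)   # e^1 = (1, -1), e^2 = (0, 1)
```

Uniqueness shows up here as the uniqueness of the matrix inverse, and existence as the fact that the matrix of a basis is nonsingular.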


Notice that we have dropped the asterisk notation for the covector $\mathbf{e}^i$ in a dual basis; the superscript alone is enough to distinguish $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$ from $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$. However, it should be kept in mind that, unlike the reciprocal basis, the dual basis is a basis for $\mathcal{V}^*$, not a basis for $\mathcal{V}$. In particular, it makes no sense to require a basis to be the same as its dual basis. This means the component form of a vector $\mathbf{v} \in \mathcal{V}$ relative to a basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$,

$$\mathbf{v} = v^i\mathbf{e}_i \tag{31.3}$$

must never be confused with the component form of a covector $\mathbf{v}^* \in \mathcal{V}^*$ relative to the dual basis,

$$\mathbf{v}^* = v_i\mathbf{e}^i \tag{31.4}$$

In order to emphasize the difference of these two component forms, we call $v^i$ the contravariant components of $\mathbf{v}$ and $v_i$ the covariant components of $\mathbf{v}^*$. A vector has contravariant components only and a covector has covariant components only. The terminology for the components is not inconsistent with the same terminology defined earlier for an inner product space, since we have the following theorem.


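Theorem 31.2. Given any inner product on $\mathcal{V}$, there exists a unique isomorphism

$$\mathbf{G} : \mathcal{V} \to \mathcal{V}^* \tag{31.5}$$

which is induced by the inner product in such a way that

$$\langle \mathbf{G}\mathbf{v}, \mathbf{w} \rangle = \mathbf{v} \cdot \mathbf{w}, \qquad \mathbf{v}, \mathbf{w} \in \mathcal{V} \tag{31.6}$$

Under this isomorphism the image of any orthonormal basis $\{\mathbf{i}_1, \ldots, \mathbf{i}_N\}$ is the dual basis $\{\mathbf{i}^1, \ldots, \mathbf{i}^N\}$, namely

$$\mathbf{G}\mathbf{i}_k = \mathbf{i}^k, \qquad k = 1, \ldots, N \tag{31.7}$$

and, more generally, if $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ is an arbitrary basis, then the image of its reciprocal basis $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$ is the dual basis $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$, namely

$$\mathbf{G}\mathbf{e}^k = \mathbf{e}^k, \qquad k = 1, \ldots, N \tag{31.8}$$

where $\mathbf{e}^k$ on the left denotes a reciprocal basis vector in $\mathcal{V}$ and $\mathbf{e}^k$ on the right denotes a dual basis covector in $\mathcal{V}^*$.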


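Proof. Since we now consider only real vector spaces and real inner product spaces, the right-hand side of (31.6), clearly, is a linear function of $\mathbf{w}$ for each $\mathbf{v} \in \mathcal{V}$. Thus $\mathbf{G}$ is well defined by the condition (31.6). We must show that $\mathbf{G}$ is an isomorphism. The fact that $\mathbf{G}$ is a linear transformation is obvious, since the right-hand side of (31.6) is linear in $\mathbf{v}$ for each $\mathbf{w} \in \mathcal{V}$. Also, $\mathbf{G}$ is one-to-one because, from (31.6), if $\mathbf{G}\mathbf{u} = \mathbf{G}\mathbf{v}$, then $\mathbf{u} \cdot \mathbf{w} = \mathbf{v} \cdot \mathbf{w}$ for all $\mathbf{w}$ and thus $\mathbf{u} = \mathbf{v}$. Now since we already know that $\dim \mathcal{V} = \dim \mathcal{V}^*$, any one-to-one linear transformation from $\mathcal{V}$ to $\mathcal{V}^*$ is necessarily onto and hence an isomorphism. The proof of (31.8) is obvious, since by the definition of the reciprocal basis we have

$$\mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j, \qquad i, j = 1, \ldots, N$$

and by the definition of the dual basis we have

$$\langle \mathbf{e}^i, \mathbf{e}_j \rangle = \delta^i_j, \qquad i, j = 1, \ldots, N$$

Comparing these definitions with (31.6), we obtain

$$\langle \mathbf{G}\mathbf{e}^i, \mathbf{e}_j \rangle = \langle \mathbf{e}^i, \mathbf{e}_j \rangle, \qquad i, j = 1, \ldots, N$$

where $\mathbf{e}^i$ is the reciprocal basis vector on the left and the dual basis covector on the right. This implies (31.8) because $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ is a basis of $\mathcal{V}$.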
Because of this theorem, if a particular inner product is assigned on $\mathcal{V}$, then we can identify $\mathcal{V}$ with $\mathcal{V}^*$ by suppressing the notation for the isomorphisms $\mathbf{G}$ and $\mathbf{G}^{-1}$. In other words, we regard a vector $\mathbf{v}$ also as a linear function on $\mathcal{V}$:

$$\langle \mathbf{v}, \mathbf{w} \rangle = \mathbf{v} \cdot \mathbf{w} \tag{31.9}$$

According to this rule the reciprocal basis is identified with the dual basis and the inner product becomes the scalar product. However, since a vector space can be equipped with many inner products, unless a particular inner product is chosen, we cannot identify $\mathcal{V}$ with $\mathcal{V}^*$ in general. In this section, we shall not assign any particular inner product in $\mathcal{V}$, so $\mathcal{V}$ and $\mathcal{V}^*$ are different vector spaces.


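We shall now derive some formulas which generalize the results of an inner product space to a vector space in general. First, if $\mathbf{v} \in \mathcal{V}$ and $\mathbf{v}^* \in \mathcal{V}^*$ are arbitrary, then their scalar product $\langle \mathbf{v}^*, \mathbf{v} \rangle$ can be computed in component form as follows: Choose a basis $\{\mathbf{e}_i\}$ and its dual basis $\{\mathbf{e}^i\}$ for $\mathcal{V}$ and $\mathcal{V}^*$, respectively, so that we can express $\mathbf{v}$ and $\mathbf{v}^*$ in component form (31.3) and (31.4). Then from (31.1) and (31.2) we have

$$\langle \mathbf{v}^*, \mathbf{v} \rangle = \langle v_i\mathbf{e}^i, v^j\mathbf{e}_j \rangle = v_i v^j \langle \mathbf{e}^i, \mathbf{e}_j \rangle = v_i v^j \delta^i_j = v_i v^i \tag{31.10}$$

which generalizes the formula (14.16). Applying (31.10) to $\mathbf{v}^* = \mathbf{e}^i$, we obtain

$$\langle \mathbf{e}^i, \mathbf{v} \rangle = v^i \tag{31.11}$$

and, similarly, applying (31.10) to $\mathbf{v} = \mathbf{e}_i$, we obtain

$$\langle \mathbf{v}^*, \mathbf{e}_i \rangle = v_i \tag{31.12}$$

which generalizes the formula (14.15)$_2$.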
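Next recall that for inner product spaces $\mathcal{V}$ and $\mathcal{U}$ we define the adjoint $\mathbf{A}^*$ of a linear transformation $\mathbf{A} : \mathcal{V} \to \mathcal{U}$ to be a linear transformation $\mathbf{A}^* : \mathcal{U} \to \mathcal{V}$ such that the following condition [cf. (18.1)] is satisfied:

$$\mathbf{u} \cdot \mathbf{A}\mathbf{v} = \mathbf{A}^*\mathbf{u} \cdot \mathbf{v}, \qquad \mathbf{u} \in \mathcal{U}, \quad \mathbf{v} \in \mathcal{V}$$

If we do not make use of any inner product, we simply replace this condition by

$$\langle \mathbf{u}^*, \mathbf{A}\mathbf{v} \rangle = \langle \mathbf{A}^*\mathbf{u}^*, \mathbf{v} \rangle, \qquad \mathbf{u}^* \in \mathcal{U}^*, \quad \mathbf{v} \in \mathcal{V} \tag{31.13}$$

Then $\mathbf{A}^*$ is a linear transformation from $\mathcal{U}^*$ to $\mathcal{V}^*$,

$$\mathbf{A}^* : \mathcal{U}^* \to \mathcal{V}^*$$

and is called the dual of $\mathbf{A}$. By the same argument as before we can prove the following theorem.


Theorem 31.3. For every linear transformation $\mathbf{A} : \mathcal{V} \to \mathcal{U}$ there exists a unique dual $\mathbf{A}^* : \mathcal{U}^* \to \mathcal{V}^*$ satisfying the condition (31.13).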


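If we choose a basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ for $\mathcal{V}$ and a basis $\{\mathbf{b}_1, \ldots, \mathbf{b}_M\}$ for $\mathcal{U}$ and express the linear transformation $\mathbf{A}$ by (18.2) and the linear transformation $\mathbf{A}^*$ by (18.3), where $\{\mathbf{b}^\alpha\}$ and $\{\mathbf{e}^k\}$ are now regarded as the dual bases of $\{\mathbf{b}_\alpha\}$ and $\{\mathbf{e}_k\}$, respectively, then (18.4) remains valid in the more general context, except that we now have

$$A^{*\alpha}{}_{k} = A^{\alpha}{}_{k} \tag{31.14}$$

since we no longer consider complex spaces. Of course, the formulas (18.5) and (18.6) are now replaced by

$$\langle \mathbf{b}^\alpha, \mathbf{A}\mathbf{e}_k \rangle = A^{\alpha}{}_{k} \tag{31.15}$$

and

$$\langle \mathbf{A}^*\mathbf{b}^\alpha, \mathbf{e}_k \rangle = A^{*\alpha}{}_{k} \tag{31.16}$$

respectively.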
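The defining condition of the dual can be checked numerically once bases are fixed. The following minimal sketch assumes that vectors and covectors are represented by lists of components relative to chosen bases and their dual bases, so that the dual acts on covector components through the transposed matrix; the helper names `apply`, `transpose` and `pair` are hypothetical.

```python
def apply(matrix, vector):
    """Components of the image of `vector` under the map with the given matrix."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


def transpose(matrix):
    """Matrix acting on covector components for the dual transformation."""
    return [list(column) for column in zip(*matrix)]


def pair(covector, vector):
    """Scalar product <v*, v> computed from components."""
    return sum(c * x for c, x in zip(covector, vector))


# A maps a 3-dimensional space V into a 2-dimensional space U.
A = [[1, 2, 0],
     [0, 1, 3]]
u_star = [2, -1]   # covector on U
v = [1, 1, 1]      # vector in V

lhs = pair(u_star, apply(A, v))             # <u*, A v>
rhs = pair(apply(transpose(A), u_star), v)  # <A* u*, v>
```

Both sides evaluate to the same number for every choice of `u_star` and `v`, which is the component statement of (31.13) for real spaces.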


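For an inner product space the orthogonal complement of a subspace $\mathcal{U}$ of $\mathcal{V}$ is a subspace $\mathcal{U}^\perp$ given by [cf. (13.2)]

$$\mathcal{U}^\perp = \{ \mathbf{v} \mid \mathbf{u} \cdot \mathbf{v} = 0 \text{ for all } \mathbf{u} \in \mathcal{U} \}$$

By the same token, if $\mathcal{V}$ is a vector space in general, then we define the orthogonal complement of $\mathcal{U}$ to be the subspace $\mathcal{U}^\perp$ of $\mathcal{V}^*$ given by

$$\mathcal{U}^\perp = \{ \mathbf{v}^* \mid \langle \mathbf{v}^*, \mathbf{u} \rangle = 0 \text{ for all } \mathbf{u} \in \mathcal{U} \} \tag{31.17}$$

In general if $\mathbf{v} \in \mathcal{V}$ and $\mathbf{v}^* \in \mathcal{V}^*$ are arbitrary, then $\mathbf{v}$ and $\mathbf{v}^*$ are said to be orthogonal to each other if $\langle \mathbf{v}^*, \mathbf{v} \rangle = 0$. We can prove the following theorem by the same argument used previously for inner product spaces.


Theorem 31.4. If $\mathcal{U}$ is a subspace of $\mathcal{V}$, then

$$\dim \mathcal{U} + \dim \mathcal{U}^\perp = \dim \mathcal{V} \tag{31.18}$$

However, $\mathcal{V}$ is no longer the direct sum of $\mathcal{U}$ and $\mathcal{U}^\perp$ since $\mathcal{U}^\perp$ is a subspace of $\mathcal{V}^*$, not a subspace of $\mathcal{V}$.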
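A minimal worked example may help here. The three-dimensional space and the subspace below are chosen purely for illustration: let $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ be a basis of the space with dual basis $\{\mathbf{e}^1, \mathbf{e}^2, \mathbf{e}^3\}$, and let the subspace be spanned by the first two basis vectors.

```latex
\begin{aligned}
\mathcal{U} &= \operatorname{span}\{\mathbf{e}_1, \mathbf{e}_2\}, \\
\mathbf{v}^* = v_i \mathbf{e}^i \in \mathcal{U}^\perp
  &\iff \langle \mathbf{v}^*, \mathbf{e}_1 \rangle = v_1 = 0
  \ \text{and}\ \langle \mathbf{v}^*, \mathbf{e}_2 \rangle = v_2 = 0, \\
\mathcal{U}^\perp &= \operatorname{span}\{\mathbf{e}^3\}, \\
\dim \mathcal{U} + \dim \mathcal{U}^\perp &= 2 + 1 = 3 = \dim \mathcal{V}
\end{aligned}
```

The complement lives in the dual space, so it is spanned by a covector, and no statement about a direct sum with the original subspace can be made.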


Using the same line of reasoning, we can generalize Theorem 18.3 to the following.


Theorem 31.5. If $\mathbf{A} : \mathcal{V} \to \mathcal{U}$ is a linear transformation, then

$$K(\mathbf{A}^*) = R(\mathbf{A})^\perp \tag{31.19}$$

and

$$K(\mathbf{A})^\perp = R(\mathbf{A}^*) \tag{31.20}$$

Similarly, we can generalize Theorem 18.4 to the following.

Theorem 31.6. Any linear transformation $\mathbf{A}$ and its dual $\mathbf{A}^*$ have the same rank.


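Finally, formulas for transferring from one basis to another basis can be generalized from inner product spaces to vector spaces in general. If $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ and $\{\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_N\}$ are bases for $\mathcal{V}$, then as before we can express one basis in component form relative to another, as shown by (14.17) and (14.18). Now suppose that $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$ and $\{\hat{\mathbf{e}}^1, \ldots, \hat{\mathbf{e}}^N\}$ are the dual bases of $\{\mathbf{e}_1, \ldots, \mathbf{e}_N\}$ and $\{\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_N\}$, respectively. Then it can be verified easily that

$$\hat{\mathbf{e}}^j = \hat{T}^j_k\mathbf{e}^k, \qquad \mathbf{e}^k = T^k_j\hat{\mathbf{e}}^j \tag{31.21}$$

where $T^k_j$ and $\hat{T}^j_k$ are defined by (14.17) and (14.18), i.e.,

$$\mathbf{e}_k = \hat{T}^j_k\hat{\mathbf{e}}_j, \qquad \hat{\mathbf{e}}_j = T^k_j\mathbf{e}_k \tag{31.22}$$

From these relations if $\mathbf{v} \in \mathcal{V}$ and $\mathbf{v}^* \in \mathcal{V}^*$ have the component forms (31.3) and (31.4) relative to $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ and the component forms

$$\mathbf{v} = \hat{v}^i\hat{\mathbf{e}}_i \tag{31.23}$$

and

$$\mathbf{v}^* = \hat{v}_i\hat{\mathbf{e}}^i \tag{31.24}$$

relative to $\{\hat{\mathbf{e}}_i\}$ and $\{\hat{\mathbf{e}}^i\}$, then the components obey the transformation laws

$$\hat{v}_j = T^k_j v_k \tag{31.25}$$

$$\hat{v}^j = \hat{T}^j_k v^k \tag{31.26}$$

$$v^k = T^k_j\hat{v}^j \tag{31.27}$$

$$v_k = \hat{T}^j_k\hat{v}_j \tag{31.28}$$

which generalize the formulas (14.24)-(14.27), respectively.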
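The effect of a change of basis on components can be illustrated numerically. The sketch below assumes the convention that the new basis is written in the old one as $\hat{\mathbf{e}}_j = T^k_j\mathbf{e}_k$ and the old basis in the new one as $\mathbf{e}_k = \hat{T}^j_k\hat{\mathbf{e}}_j$, so that the two coefficient arrays are inverse to each other; the arrays `T` and `T_HAT` and the function names are hypothetical and chosen only for this example. Covariant components then transform like the basis vectors, contravariant components transform with the inverse array, and the scalar product $v_i v^i$ is unchanged.

```python
# T[j][k] holds T_j^k in  e_hat_j = T_j^k e_k  (new basis in terms of the old one).
T = [[1, 1],
     [0, 1]]
# T_HAT[k][j] holds T_hat_k^j in  e_k = T_hat_k^j e_hat_j  (the inverse change).
T_HAT = [[1, -1],
         [0, 1]]


def new_contravariant(v_up):
    """Components of a vector relative to the new basis."""
    n = len(v_up)
    return [sum(T_HAT[k][j] * v_up[k] for k in range(n)) for j in range(n)]


def new_covariant(v_down):
    """Components of a covector relative to the new dual basis."""
    n = len(v_down)
    return [sum(T[j][k] * v_down[k] for k in range(n)) for j in range(n)]


def pair(covector, vector):
    """Scalar product <v*, v> computed from components."""
    return sum(c * x for c, x in zip(covector, vector))


v_up = [3, 2]      # v  = 3 e_1 + 2 e_2
v_down = [5, -4]   # v* = 5 e^1 - 4 e^2

v_up_hat = new_contravariant(v_up)    # [3, -1]
v_down_hat = new_covariant(v_down)    # [1, -4]
```

Here $\hat{\mathbf{e}}_1 = \mathbf{e}_1 + \mathbf{e}_2$ and $\hat{\mathbf{e}}_2 = \mathbf{e}_2$, so $\mathbf{v} = 3\hat{\mathbf{e}}_1 - \hat{\mathbf{e}}_2$, and the scalar product equals 7 in both sets of components.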


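Exercises

31.1 If $\mathbf{A} : \mathcal{V} \to \mathcal{U}$ and $\mathbf{B} : \mathcal{U} \to \mathcal{W}$ are linear transformations, show that

$$(\mathbf{B}\mathbf{A})^* = \mathbf{A}^*\mathbf{B}^*$$

31.2 If $\mathbf{A} : \mathcal{V} \to \mathcal{U}$ and $\mathbf{B} : \mathcal{V} \to \mathcal{U}$ are linear transformations, show that

$$(\alpha\mathbf{A} + \beta\mathbf{B})^* = \alpha\mathbf{A}^* + \beta\mathbf{B}^*$$

and that

$$\mathbf{A}^* = \mathbf{0} \iff \mathbf{A} = \mathbf{0}$$

These two conditions mean that the operation of taking the dual is an isomorphism

$$* : \mathcal{L}(\mathcal{V};\mathcal{U}) \to \mathcal{L}(\mathcal{U}^*;\mathcal{V}^*)$$

31.3 If $\mathbf{A} : \mathcal{V} \to \mathcal{U}$ is an isomorphism, show that $\mathbf{A}^* : \mathcal{U}^* \to \mathcal{V}^*$ is also an isomorphism; moreover,

$$(\mathbf{A}^*)^{-1} = (\mathbf{A}^{-1})^*$$

31.4 If $\mathcal{V}$ has the decomposition $\mathcal{V} = \mathcal{U}_1 \oplus \mathcal{U}_2$ and if $\mathbf{P} : \mathcal{V} \to \mathcal{V}$ is the projection of $\mathcal{V}$ on $\mathcal{U}_1$ along $\mathcal{U}_2$, show that $\mathcal{V}^*$ has the decomposition $\mathcal{V}^* = \mathcal{U}_1^\perp \oplus \mathcal{U}_2^\perp$ and that $\mathbf{P}^* : \mathcal{V}^* \to \mathcal{V}^*$ is the projection of $\mathcal{V}^*$ on $\mathcal{U}_2^\perp$ along $\mathcal{U}_1^\perp$.

31.5 If $\mathcal{U}_1$ and $\mathcal{U}_2$ are subspaces of $\mathcal{V}$, show that

$$(\mathcal{U}_1 + \mathcal{U}_2)^\perp = \mathcal{U}_1^\perp \cap \mathcal{U}_2^\perp \quad \text{and} \quad (\mathcal{U}_1 \cap \mathcal{U}_2)^\perp = \mathcal{U}_1^\perp + \mathcal{U}_2^\perp$$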




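31.6 Show that the linear transformation $\mathbf{G} : \mathcal{V} \to \mathcal{V}^*$ defined in Theorem 31.2 obeys the condition

$$\langle \mathbf{G}\mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{G}\mathbf{w}, \mathbf{v} \rangle$$

31.7 Show that an inner product on $\mathcal{V}^*$ is induced by an inner product on $\mathcal{V}$ by the formula

$$\mathbf{v}^* \cdot \mathbf{w}^* = \mathbf{G}^{-1}\mathbf{v}^* \cdot \mathbf{G}^{-1}\mathbf{w}^* \tag{31.29}$$

where $\mathbf{G}$ is the isomorphism defined in Theorem 31.2.

31.8 If $\{\mathbf{e}_i\}$ is a basis for $\mathcal{V}$ and $\{\mathbf{e}^i\}$ is its dual basis in $\mathcal{V}^*$, show that $\{\mathbf{G}\mathbf{e}_i\}$ is the reciprocal basis of $\{\mathbf{e}^i\}$ with respect to the inner product on $\mathcal{V}^*$ defined by (31.29) in the preceding exercise.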
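Section 32. The Second Dual Space, Canonical Isomorphisms


In the preceding section we defined the dual space $\mathcal{V}^*$ of any vector space $\mathcal{V}$ to be the space of linear functions $\mathcal{L}(\mathcal{V};\mathcal{R})$ from $\mathcal{V}$ to $\mathcal{R}$. By the same procedure we can define the dual space $(\mathcal{V}^*)^*$ of $\mathcal{V}^*$ by

$$(\mathcal{V}^*)^* = \mathcal{L}(\mathcal{V}^*;\mathcal{R}) = \mathcal{L}(\mathcal{L}(\mathcal{V};\mathcal{R});\mathcal{R}) \tag{32.1}$$

For simplicity let us denote this space by $\mathcal{V}^{**}$, called the second dual space of $\mathcal{V}$. Of course, the dimension of $\mathcal{V}^{**}$ is the same as that of $\mathcal{V}$, namely

$$\dim \mathcal{V} = \dim \mathcal{V}^* = \dim \mathcal{V}^{**} \tag{32.2}$$

Using the system of notation introduced in the preceding section, we write a typical element of $\mathcal{V}^{**}$ as $\mathbf{v}^{**}$. Then $\mathbf{v}^{**}$ is a linear function on $\mathcal{V}^*$

$$\mathbf{v}^{**} : \mathcal{V}^* \to \mathcal{R}$$

Further, for each $\mathbf{v}^* \in \mathcal{V}^*$ we denote the value of $\mathbf{v}^{**}$ at $\mathbf{v}^*$ by $\langle \mathbf{v}^{**}, \mathbf{v}^* \rangle$. The $\langle\ ,\ \rangle$ operation is now a mapping from $\mathcal{V}^{**} \times \mathcal{V}^*$ to $\mathcal{R}$ and possesses the same four properties given in the preceding section.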


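Unlike the dual space $\mathcal{V}^*$, the second dual space $\mathcal{V}^{**}$ can always be identified with $\mathcal{V}$ without using any inner product. The isomorphism

$$\mathbf{J} : \mathcal{V} \to \mathcal{V}^{**} \tag{32.3}$$

is defined by the condition

$$\langle \mathbf{J}\mathbf{v}, \mathbf{v}^* \rangle = \langle \mathbf{v}^*, \mathbf{v} \rangle, \qquad \mathbf{v} \in \mathcal{V}, \quad \mathbf{v}^* \in \mathcal{V}^* \tag{32.4}$$

Clearly, $\mathbf{J}$ is well defined by (32.4) since for each $\mathbf{v} \in \mathcal{V}$ the right-hand side of (32.4) is a linear function of $\mathbf{v}^*$. To see that $\mathbf{J}$ is an isomorphism, we notice first that $\mathbf{J}$ is a linear transformation, because for each $\mathbf{v}^* \in \mathcal{V}^*$ the right-hand side is linear in $\mathbf{v}$. Now $\mathbf{J}$ is also one-to-one, since if $\mathbf{J}\mathbf{v} = \mathbf{J}\mathbf{u}$, then (32.4) implies that

$$\langle \mathbf{v}^*, \mathbf{v} \rangle = \langle \mathbf{v}^*, \mathbf{u} \rangle, \qquad \mathbf{v}^* \in \mathcal{V}^* \tag{32.5}$$

which then implies $\mathbf{u} = \mathbf{v}$. From (32.2), we conclude that the one-to-one linear transformation $\mathbf{J}$ is onto, and thus $\mathbf{J}$ is an isomorphism. We summarize this result in the following theorem.


Theorem 32.1. There exists a unique isomorphism $\mathbf{J}$ from $\mathcal{V}$ to $\mathcal{V}^{**}$ satisfying the condition (32.4).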


Since the isomorphism $\mathbf{J}$ is defined without using any structure in addition to the vector space structure on $\mathcal{V}$, its notation can often be suppressed without any ambiguity. We shall adopt such a convention here and identify any $\mathbf{v} \in \mathcal{V}$ as a linear function on $\mathcal{V}^*$

$$\mathbf{v} : \mathcal{V}^* \to \mathcal{R}$$

by the condition that defines $\mathbf{J}$, namely

$$\langle \mathbf{v}, \mathbf{v}^* \rangle = \langle \mathbf{v}^*, \mathbf{v} \rangle, \qquad \text{for all } \mathbf{v}^* \in \mathcal{V}^* \tag{32.6}$$

In doing so, we allow the same symbol $\mathbf{v}$ to represent two different objects: an element of the vector space $\mathcal{V}$ and a linear function on the vector space $\mathcal{V}^*$, and the two objects are related to each other through the condition (32.6).
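The identification of a vector with a linear function on covectors can be mimicked with closures. This is only a sketch under the assumption that vectors are lists of components and covectors are Python functions built from their components; the names `covector` and `J` are hypothetical. The map `J` needs nothing but evaluation, which mirrors the fact that no inner product or basis-dependent choice enters its definition.

```python
def covector(components):
    """Return the linear function v* with the given covariant components."""
    def evaluate(vector):
        return sum(c * x for c, x in zip(components, vector))
    return evaluate


def J(vector):
    """Canonical map into the second dual: Jv sends v* to <v*, v>."""
    def evaluate_at(cov):
        return cov(vector)
    return evaluate_at


v = [1, 2, 3]
w_star = covector([4, 0, -1])

value_from_second_dual = J(v)(w_star)   # <Jv, w*>
value_from_dual = w_star(v)             # <w*, v>
```

Both values agree by construction, and `J(v)` is linear in its covector argument because evaluation at a fixed vector is linear.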


To distinguish an isomorphism such as J, whose notation may be suppressed without 
causing any ambiguity, from an isomorphism such as G, defined by (31.6), whose notation may 
not be suppressed, because there are many isomorphisms of similar nature, we call the former 
isomorphism a canonical or natural isomorphism. Whether or not an isomorphism is canonical 
is usually determined by a convention, not by any axioms. A general rule for choosing a 
canonical isomorphism is that the isomorphism must be defined without using any additional 
structure other than the basic structure already assigned to the underlying spaces; further, by 
suppressing the notation of the canonical isomorphism no ambiguity is likely to arise. Hence the 
choice of a canonical isomorphism depends on the basic structure of the vector spaces. If we 
deal with inner product spaces equipped with particular inner products, the isomorphism G can 
safely be regarded as canonical, and by choosing G to be canonical, we can achieve much 
economy in writing. On the other hand, if we consider vector spaces without any pre-assigned 
inner product, then we cannot make all possible isomorphisms G canonical, otherwise the 
notation becomes ambiguous. 


It should be noticed that not every isomorphism whose definition depends only on the basic structure of the underlying space can be made canonical. For example, the operation of taking the dual:

$$* : \mathcal{L}(\mathcal{V};\mathcal{U}) \to \mathcal{L}(\mathcal{U}^*;\mathcal{V}^*)$$

is defined by using the vector space structure of $\mathcal{V}$ and $\mathcal{U}$ only. However, by suppressing the notation $*$, we encounter immediately much ambiguity, especially when $\mathcal{U}$ is equal to $\mathcal{V}$. Surely, we do not wish to make every endomorphism $\mathbf{A} : \mathcal{V} \to \mathcal{V}$ self-adjoint! Another example will illustrate the point even more clearly. The operation of taking the opposite vector of any vector is an isomorphism

$$- : \mathcal{V} \to \mathcal{V}$$

which is defined by using the vector space structure alone. Evidently, we cannot suppress the minus sign without any ambiguity.


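To test whether the isomorphism $\mathbf{J}$ can be made a canonical one without any ambiguity, we consider the effect of this choice on the notations for the dual basis and the dual of a linear transformation. Of course we wish to have $\{\mathbf{e}_i\}$, when considered as a basis for $\mathcal{V}^{**}$, to be the dual basis of $\{\mathbf{e}^i\}$, and $\mathbf{A}$, when considered as a linear transformation from $\mathcal{V}^{**}$ to $\mathcal{U}^{**}$, to be the dual of $\mathbf{A}^*$. These results are indeed correct and they are contained in the following.


Theorem 32.2. Given any basis $\{\mathbf{e}_i\}$ for $\mathcal{V}$, then the dual basis of its dual basis $\{\mathbf{e}^i\}$ is $\{\mathbf{J}\mathbf{e}_i\}$.


Proof. This result is more or less obvious. By definition, the basis $\{\mathbf{e}_i\}$ and its dual basis $\{\mathbf{e}^i\}$ are related by

$$\langle \mathbf{e}^i, \mathbf{e}_j \rangle = \delta^i_j$$

for $i, j = 1, \ldots, N$. From (32.4) we have

$$\langle \mathbf{J}\mathbf{e}_j, \mathbf{e}^i \rangle = \langle \mathbf{e}^i, \mathbf{e}_j \rangle$$

Comparing the preceding two equations, we see that

$$\langle \mathbf{J}\mathbf{e}_j, \mathbf{e}^i \rangle = \delta^i_j$$

which means that $\{\mathbf{J}\mathbf{e}_1, \ldots, \mathbf{J}\mathbf{e}_N\}$ is the dual basis of $\{\mathbf{e}^1, \ldots, \mathbf{e}^N\}$.


Because of this theorem, after suppressing the notation for $\mathbf{J}$ we say that $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ are dual relative to each other. The next theorem shows that the same relation holds between $\mathbf{A}$ and $\mathbf{A}^*$.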
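Theorem 32.3. Given any linear transformation $\mathbf{A} : \mathcal{V} \to \mathcal{U}$, the dual of its dual $\mathbf{A}^*$ is $\mathbf{J}_\mathcal{U}\mathbf{A}\mathbf{J}_\mathcal{V}^{-1}$. Here $\mathbf{J}_\mathcal{V}$ denotes the isomorphism from $\mathcal{V}$ to $\mathcal{V}^{**}$ defined by (32.4) and $\mathbf{J}_\mathcal{U}$ denotes the isomorphism from $\mathcal{U}$ to $\mathcal{U}^{**}$ defined by a similar condition.

Proof. By definition $\mathbf{A}$ and $\mathbf{A}^*$ are related by (31.13)

$$\langle \mathbf{u}^*, \mathbf{A}\mathbf{v} \rangle = \langle \mathbf{A}^*\mathbf{u}^*, \mathbf{v} \rangle$$

for all $\mathbf{u}^* \in \mathcal{U}^*$, $\mathbf{v} \in \mathcal{V}$. From (32.4) we have

$$\langle \mathbf{u}^*, \mathbf{A}\mathbf{v} \rangle = \langle \mathbf{J}_\mathcal{U}\mathbf{A}\mathbf{v}, \mathbf{u}^* \rangle, \qquad \langle \mathbf{A}^*\mathbf{u}^*, \mathbf{v} \rangle = \langle \mathbf{J}_\mathcal{V}\mathbf{v}, \mathbf{A}^*\mathbf{u}^* \rangle$$

Comparing the preceding three equations, we see that

$$\langle \mathbf{J}_\mathcal{U}\mathbf{A}\mathbf{J}_\mathcal{V}^{-1}(\mathbf{J}_\mathcal{V}\mathbf{v}), \mathbf{u}^* \rangle = \langle \mathbf{J}_\mathcal{V}\mathbf{v}, \mathbf{A}^*\mathbf{u}^* \rangle$$

Since $\mathbf{J}_\mathcal{V}$ is an isomorphism, we can rewrite the last equation as

$$\langle \mathbf{J}_\mathcal{U}\mathbf{A}\mathbf{J}_\mathcal{V}^{-1}\mathbf{v}^{**}, \mathbf{u}^* \rangle = \langle \mathbf{v}^{**}, \mathbf{A}^*\mathbf{u}^* \rangle$$

Because $\mathbf{v}^{**} \in \mathcal{V}^{**}$ and $\mathbf{u}^* \in \mathcal{U}^*$ are arbitrary, it follows that $\mathbf{J}_\mathcal{U}\mathbf{A}\mathbf{J}_\mathcal{V}^{-1}$ is the dual of $\mathbf{A}^*$. So if we suppress the notations for $\mathbf{J}_\mathcal{U}$ and $\mathbf{J}_\mathcal{V}$, then $\mathbf{A}$ and $\mathbf{A}^*$ are the duals relative to each other.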


A similar result exists for the operation of taking the orthogonal complement of a subspace; we have the following result.


Theorem 32.4. Given any subspace $\mathcal{U}$ of $\mathcal{V}$, the orthogonal complement of its orthogonal complement $\mathcal{U}^\perp$ is $\mathbf{J}(\mathcal{U})$.


We leave the proof of this theorem as an exercise. Because of this theorem we say that $\mathcal{U}$ and $\mathcal{U}^\perp$ are orthogonal to each other. As we shall see in the next few sections, the use of canonical isomorphisms, like the summation convention, is an important device to achieve economy in writing. We shall make use of this device whenever possible, so the reader should be prepared to allow one symbol to represent two or more different objects.


The last three theorems show clearly the advantage of making $\mathbf{J}$ a canonical isomorphism, so from now on we shall suppress the symbol for $\mathbf{J}$. In general, if an isomorphism from a vector space $\mathcal{V}$ to a vector space $\mathcal{U}$ is chosen to be canonical, then we write

$$\mathcal{V} \cong \mathcal{U} \tag{32.7}$$

In particular, we have

$$\mathcal{V} \cong \mathcal{V}^{**} \tag{32.8}$$


Exercises 


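32.1 Prove Theorem 32.4.
32.2 Show that by making $\mathbf{J}$ a canonical isomorphism essentially we have identified $\mathcal{V}$ with $\mathcal{V}^{**}, \mathcal{V}^{****}, \ldots$, and $\mathcal{V}^*$ with $\mathcal{V}^{***}, \ldots$. So a symbol $\mathbf{v}$ or $\mathbf{v}^*$, in fact, represents infinitely many objects.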
Section 33. Multilinear Functions, Tensors


In Section 31 we discussed the concept of a linear function and the concept of a bilinear function. These concepts can be generalized in an obvious way to multilinear functions. In general if $\mathcal{V}_1, \ldots, \mathcal{V}_s$ is a collection of vector spaces, then an $s$-linear function is a function

$$\mathbf{A} : \mathcal{V}_1 \times \cdots \times \mathcal{V}_s \to \mathcal{R} \tag{33.1}$$

that is linear in each of its variables while the other variables are held constant. If the vector spaces $\mathcal{V}_1, \ldots, \mathcal{V}_s$ are the vector space $\mathcal{V}$ or its dual space $\mathcal{V}^*$, then $\mathbf{A}$ is called a tensor on $\mathcal{V}$. More specifically, a tensor of order $(p, q)$ on $\mathcal{V}$, where $p$ and $q$ are positive integers, is a $(p+q)$-linear function

$$\underbrace{\mathcal{V}^* \times \cdots \times \mathcal{V}^*}_{p \text{ times}} \times \underbrace{\mathcal{V} \times \cdots \times \mathcal{V}}_{q \text{ times}} \to \mathcal{R} \tag{33.2}$$

We shall extend this definition to the case $p = q = 0$ and define a tensor of order $(0, 0)$ to be a scalar in $\mathcal{R}$. A tensor of order $(p, 0)$ is a pure contravariant tensor of order $p$ and a tensor of order $(0, q)$ is a pure covariant tensor of order $q$. In particular, a vector $\mathbf{v} \in \mathcal{V}$ is a pure contravariant tensor of order one. This terminology, of course, is defined relative to a given vector space $\mathcal{V}$ as we have explained in Section 31. If a tensor is not a pure contravariant tensor or a pure covariant tensor, then it is a mixed tensor, and for a mixed tensor of order $(p, q)$, $p$ is the contravariant order and $q$ is the covariant order.


For definiteness, we denote the set of all tensors of order $(p, q)$ on $\mathcal{V}$ by the symbol $\mathcal{T}^p_q(\mathcal{V})$. However, the set of pure contravariant tensors of order $p$ shall be denoted simply by $\mathcal{T}^p(\mathcal{V})$ and the set of pure covariant tensors of order $q$ shall be denoted simply by $\mathcal{T}_q(\mathcal{V})$. Of course, tensors of order $(0, 0)$ form the set $\mathcal{R}$ and

$$\mathcal{T}^1(\mathcal{V}) = \mathcal{V}, \qquad \mathcal{T}_1(\mathcal{V}) = \mathcal{V}^* \tag{33.3}$$

Here we have made use of the identification of $\mathcal{V}$ with $\mathcal{V}^{**}$ as explained in Section 32.


We shall now give some examples of tensors. 


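Example 1. If $\mathbf{A}$ is an endomorphism of $\mathcal{V}$, then we define a function $\hat{\mathbf{A}} : \mathcal{V}^* \times \mathcal{V} \to \mathcal{R}$ by

$$\hat{\mathbf{A}}(\mathbf{v}^*, \mathbf{v}) = \langle \mathbf{v}^*, \mathbf{A}\mathbf{v} \rangle \tag{33.4}$$

for all $\mathbf{v}^* \in \mathcal{V}^*$ and $\mathbf{v} \in \mathcal{V}$. Clearly $\hat{\mathbf{A}}$ is bilinear and thus $\hat{\mathbf{A}} \in \mathcal{T}^1_1(\mathcal{V})$. As we shall see later, it is possible to establish a canonical isomorphism from $\mathcal{L}(\mathcal{V};\mathcal{V})$ to $\mathcal{T}^1_1(\mathcal{V})$ in such a way that the endomorphism $\mathbf{A}$ is identified with the bilinear function $\hat{\mathbf{A}}$. Then the same symbol $\mathbf{A}$ shall represent two objects, namely, an endomorphism of $\mathcal{V}$ and a bilinear function of $\mathcal{V}^* \times \mathcal{V}$. Then (33.4) becomes simply

$$\mathbf{A}(\mathbf{v}^*, \mathbf{v}) = \langle \mathbf{v}^*, \mathbf{A}\mathbf{v} \rangle \tag{33.5}$$

Under this canonical isomorphism, the identity automorphism of $\mathcal{V}$ is identified with the scalar product,

$$\mathbf{I}(\mathbf{v}^*, \mathbf{v}) = \langle \mathbf{v}^*, \mathbf{I}\mathbf{v} \rangle = \langle \mathbf{v}^*, \mathbf{v} \rangle \tag{33.6}$$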
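Example 2. If $\mathbf{v}$ is a vector in $\mathscr{V}$ and $\mathbf{v}^*$ is a covector in $\mathscr{V}^*$, then we define a function

$$\mathbf{v}\otimes\mathbf{v}^*:\mathscr{V}^*\times\mathscr{V}\to\mathscr{R} \tag{33.7}$$

by

$$\mathbf{v}\otimes\mathbf{v}^*(\mathbf{u}^*,\mathbf{u})=\langle\mathbf{u}^*,\mathbf{v}\rangle\langle\mathbf{v}^*,\mathbf{u}\rangle \tag{33.8}$$

for all $\mathbf{u}^*\in\mathscr{V}^*$, $\mathbf{u}\in\mathscr{V}$. Clearly, $\mathbf{v}\otimes\mathbf{v}^*$ is a bilinear function, so $\mathbf{v}\otimes\mathbf{v}^*\in\mathscr{T}^1_1(\mathscr{V})$. If we make use of the canonical isomorphism to be established between $\mathscr{T}^1_1(\mathscr{V})$ and $\mathscr{L}(\mathscr{V};\mathscr{V})$, the tensor $\mathbf{v}\otimes\mathbf{v}^*$ corresponds to an endomorphism of $\mathscr{V}$ such that

$$\mathbf{v}\otimes\mathbf{v}^*(\mathbf{u}^*,\mathbf{u})=\langle\mathbf{u}^*,\mathbf{v}\otimes\mathbf{v}^*\,\mathbf{u}\rangle$$

or equivalently

$$\langle\mathbf{u}^*,\mathbf{v}\rangle\langle\mathbf{v}^*,\mathbf{u}\rangle=\langle\mathbf{u}^*,\mathbf{v}\otimes\mathbf{v}^*\,\mathbf{u}\rangle$$

for all $\mathbf{u}^*\in\mathscr{V}^*$, $\mathbf{u}\in\mathscr{V}$. By the bilinearity of the scalar product, the last equation can be rewritten in the form

$$\langle\mathbf{u}^*,\langle\mathbf{v}^*,\mathbf{u}\rangle\mathbf{v}\rangle=\langle\mathbf{u}^*,\mathbf{v}\otimes\mathbf{v}^*\,\mathbf{u}\rangle$$

Then by the definiteness of the scalar product, we have

$$\langle\mathbf{v}^*,\mathbf{u}\rangle\mathbf{v}=\mathbf{v}\otimes\mathbf{v}^*\,\mathbf{u}\quad\text{for all }\mathbf{u}\in\mathscr{V} \tag{33.9}$$

which defines $\mathbf{v}\otimes\mathbf{v}^*$ as an endomorphism of $\mathscr{V}$. The tensor or the endomorphism $\mathbf{v}\otimes\mathbf{v}^*$ is called the tensor product of $\mathbf{v}$ and $\mathbf{v}^*$.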
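The two readings of the tensor product in Example 2 can be checked numerically. The following is a minimal sketch, assuming that vectors and covectors are stored as lists of components relative to a pair of dual bases, so that the scalar product is a plain sum of products. The function names `pairing`, `tensor_product`, and `as_endomorphism` and the numerical values are hypothetical and chosen only for illustration.

```python
def pairing(covector, vector):
    """Scalar product <v*, v> in dual bases: the sum of v*_i v^i."""
    return sum(c * x for c, x in zip(covector, vector))

def tensor_product(v, v_star):
    """Return v (x) v* as a bilinear function of (u*, u), following (33.8)."""
    def bilinear(u_star, u):
        return pairing(u_star, v) * pairing(v_star, u)
    return bilinear

def as_endomorphism(v, v_star):
    """Return v (x) v* as the linear map u -> <v*, u> v, following (33.9)."""
    def apply(u):
        scale = pairing(v_star, u)
        return [scale * x for x in v]
    return apply

# Hypothetical component values in a three-dimensional space.
v, v_star = [1, 2, 0], [3, -1, 4]
u, u_star = [2, 5, 1], [0, 1, 7]

bilinear = tensor_product(v, v_star)
endo = as_endomorphism(v, v_star)

lhs = bilinear(u_star, u)        # v (x) v*(u*, u)
rhs = pairing(u_star, endo(u))   # <u*, (v (x) v*) u>
```

Both `lhs` and `rhs` evaluate the same tensor, which is the content of the identification of $\mathscr{T}^1_1(\mathscr{V})$ with $\mathscr{L}(\mathscr{V};\mathscr{V})$ used in the example.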


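Clearly, the tensor product can be defined for an arbitrary number of vectors and covectors. Let $\mathbf{v}_1,\dots,\mathbf{v}_p$ be vectors in $\mathscr{V}$ and $\mathbf{v}^1,\dots,\mathbf{v}^q$ be covectors in $\mathscr{V}^*$. Then we define a function

$$\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q:\underbrace{\mathscr{V}^*\times\cdots\times\mathscr{V}^*}_{p\text{ times}}\times\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{q\text{ times}}\to\mathscr{R}$$

by

$$\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q(\mathbf{u}^1,\dots,\mathbf{u}^p,\mathbf{u}_1,\dots,\mathbf{u}_q)=\langle\mathbf{u}^1,\mathbf{v}_1\rangle\cdots\langle\mathbf{u}^p,\mathbf{v}_p\rangle\langle\mathbf{v}^1,\mathbf{u}_1\rangle\cdots\langle\mathbf{v}^q,\mathbf{u}_q\rangle \tag{33.10}$$

for all $\mathbf{u}_1,\dots,\mathbf{u}_q\in\mathscr{V}$ and $\mathbf{u}^1,\dots,\mathbf{u}^p\in\mathscr{V}^*$. Clearly this function is $(p+q)$-linear, so that $\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q\in\mathscr{T}^p_q(\mathscr{V})$; it is called the tensor product of $\mathbf{v}_1,\dots,\mathbf{v}_p$ and $\mathbf{v}^1,\dots,\mathbf{v}^q$.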


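Having seen some examples of tensors on $\mathscr{V}$, we turn now to the structure of the set $\mathscr{T}^p_q(\mathscr{V})$. We claim that $\mathscr{T}^p_q(\mathscr{V})$ has the structure of a vector space and the dimension of $\mathscr{T}^p_q(\mathscr{V})$ is equal to $N^{(p+q)}$, where $N$ is the dimension of $\mathscr{V}$. To make $\mathscr{T}^p_q(\mathscr{V})$ a vector space, we define the operation of addition of any $\mathbf{A},\mathbf{B}\in\mathscr{T}^p_q(\mathscr{V})$ and the scalar multiplication of $\mathbf{A}$ by $\alpha\in\mathscr{R}$ by

$$(\mathbf{A}+\mathbf{B})(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)=\mathbf{A}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)+\mathbf{B}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q) \tag{33.11}$$

and

$$(\alpha\mathbf{A})(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)=\alpha\mathbf{A}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q) \tag{33.12}$$

respectively, for all $\mathbf{v}^1,\dots,\mathbf{v}^p\in\mathscr{V}^*$ and $\mathbf{v}_1,\dots,\mathbf{v}_q\in\mathscr{V}$. We leave as an exercise to the reader the proof of the following theorem.

Theorem 33.1. $\mathscr{T}^p_q(\mathscr{V})$ is a vector space with respect to the operations of addition and scalar multiplication defined by (33.11) and (33.12). The null element of $\mathscr{T}^p_q(\mathscr{V})$, of course, is the zero tensor $\mathbf{0}$:

$$\mathbf{0}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)=0 \tag{33.13}$$

for all $\mathbf{v}^1,\dots,\mathbf{v}^p\in\mathscr{V}^*$ and $\mathbf{v}_1,\dots,\mathbf{v}_q\in\mathscr{V}$.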


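Next, we determine the dimension of the vector space $\mathscr{T}^p_q(\mathscr{V})$ by introducing the concept of a product basis.

Theorem 33.2. Let $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ be dual bases for $\mathscr{V}$ and $\mathscr{V}^*$. Then the set of tensor products

$$\{\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q},\ i_1,\dots,i_p,j_1,\dots,j_q=1,\dots,N\} \tag{33.14}$$

forms a basis for $\mathscr{T}^p_q(\mathscr{V})$, called the product basis. In particular,

$$\dim\mathscr{T}^p_q(\mathscr{V})=N^{(p+q)} \tag{33.15}$$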


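Proof. We shall prove that the set of tensor products (33.14) is a linearly independent generating set for $\mathscr{T}^p_q(\mathscr{V})$. To prove that the set (33.14) is linearly independent, let

$$A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q}=\mathbf{0} \tag{33.16}$$

where the right-hand side is the zero tensor given by (33.13). Then from (33.10) we have

$$\begin{aligned}
0&=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q}(\mathbf{e}^{k_1},\dots,\mathbf{e}^{k_p},\mathbf{e}_{l_1},\dots,\mathbf{e}_{l_q})\\
&=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\langle\mathbf{e}^{k_1},\mathbf{e}_{i_1}\rangle\cdots\langle\mathbf{e}^{k_p},\mathbf{e}_{i_p}\rangle\langle\mathbf{e}^{j_1},\mathbf{e}_{l_1}\rangle\cdots\langle\mathbf{e}^{j_q},\mathbf{e}_{l_q}\rangle\\
&=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\delta^{k_1}_{i_1}\cdots\delta^{k_p}_{i_p}\,\delta^{j_1}_{l_1}\cdots\delta^{j_q}_{l_q}=A^{k_1\cdots k_p}{}_{l_1\cdots l_q}
\end{aligned} \tag{33.17}$$

which shows that the set (33.14) is linearly independent. Next, we show that every tensor $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ can be expressed as a linear combination of the set (33.14). We define $N^{(p+q)}$ scalars $\{A^{i_1\cdots i_p}{}_{j_1\cdots j_q},\ i_1,\dots,i_p,j_1,\dots,j_q=1,\dots,N\}$ by

$$A^{i_1\cdots i_p}{}_{j_1\cdots j_q}=\mathbf{A}(\mathbf{e}^{i_1},\dots,\mathbf{e}^{i_p},\mathbf{e}_{j_1},\dots,\mathbf{e}_{j_q}) \tag{33.18}$$

Now we reverse the steps of equation (33.17) and obtain

$$A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q}(\mathbf{e}^{k_1},\dots,\mathbf{e}^{k_p},\mathbf{e}_{l_1},\dots,\mathbf{e}_{l_q})=\mathbf{A}(\mathbf{e}^{k_1},\dots,\mathbf{e}^{k_p},\mathbf{e}_{l_1},\dots,\mathbf{e}_{l_q}) \tag{33.19}$$

for all $k_1,\dots,k_p,l_1,\dots,l_q=1,\dots,N$. Since $\mathbf{A}$ is multilinear and since $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ are dual bases for $\mathscr{V}$ and $\mathscr{V}^*$, the condition (33.19) implies that

$$A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)=\mathbf{A}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)$$

for all $\mathbf{v}_1,\dots,\mathbf{v}_q\in\mathscr{V}$ and $\mathbf{v}^1,\dots,\mathbf{v}^p\in\mathscr{V}^*$ and thus

$$\mathbf{A}=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{33.20}$$

Now from Theorem 9.10 we conclude that the set (33.14) is a basis for $\mathscr{T}^p_q(\mathscr{V})$.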
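The two halves of the proof can be mirrored in a short computation. The sketch below is only an illustration under stated assumptions: it takes $N=2$, picks a hypothetical trilinear function `A` of order $(1,2)$, reads off its components by evaluating it on the dual bases as in (33.18), and then rebuilds its value on arbitrary arguments from the expansion (33.20). The names `basis`, `components`, and `from_components` are not from the text.

```python
from itertools import product

N = 2  # dimension of V, an assumption made for this example

def A(w, x, y):
    """A hypothetical tensor of order (1, 2): linear in the covector w and in the vectors x, y."""
    return w[0] * x[1] * y[0] + 2 * w[1] * x[0] * y[1] - w[1] * x[1] * y[1]

def basis(i):
    """Component list of the i-th basis vector (or of the i-th dual basis covector)."""
    return [1 if k == i else 0 for k in range(N)]

# (33.18): the components are the values of A on the dual bases.
components = {
    (i, j, k): A(basis(i), basis(j), basis(k))
    for i, j, k in product(range(N), repeat=3)
}

def from_components(w, x, y):
    """(33.20): evaluate the linear combination of product-basis tensors."""
    return sum(c * w[i] * x[j] * y[k] for (i, j, k), c in components.items())

# Hypothetical arguments: one covector and two vectors.
w, x, y = [3, -1], [2, 5], [4, 7]
value_direct = A(w, x, y)
value_rebuilt = from_components(w, x, y)
```

The dictionary holds $N^{(p+q)}=2^3$ entries, in agreement with (33.15), and the rebuilt value agrees with the direct evaluation.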


Having determined the dimension of $\mathscr{T}^p_q(\mathscr{V})$, we can now prove that $\mathscr{T}^1_1(\mathscr{V})$ is isomorphic to $\mathscr{L}(\mathscr{V};\mathscr{V})$, a result mentioned in Example 1. As before we define the operation

$$\hat{\ }:\mathscr{L}(\mathscr{V};\mathscr{V})\to\mathscr{T}^1_1(\mathscr{V})$$

by the condition (33.4). Since from (16.8) and (33.15), the vector spaces $\mathscr{L}(\mathscr{V};\mathscr{V})$ and $\mathscr{T}^1_1(\mathscr{V})$ are of the same dimension, namely $N^2$, it suffices to show that the operation $\hat{\ }$ is one-to-one. Indeed, let $\hat{\mathbf{A}}=\mathbf{0}$. Then from (33.13) and (33.4) we have

$$\langle\mathbf{v}^*,\mathbf{A}\mathbf{v}\rangle=0$$

for all $\mathbf{v}^*\in\mathscr{V}^*$ and all $\mathbf{v}\in\mathscr{V}$. Now since the scalar product is definite, this condition implies

$$\mathbf{A}\mathbf{v}=\mathbf{0}$$

for all $\mathbf{v}\in\mathscr{V}$, and thus $\mathbf{A}=\mathbf{0}$. Consequently the operation $\hat{\ }$ is an isomorphism.

As remarked in Example 1, we shall suppress the notation $\hat{\ }$ by regarding the isomorphism it represents as a canonical one. Therefore, we can replace the formula (33.4) by the formula (33.5).


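Now returning to the space $\mathscr{T}^p_q(\mathscr{V})$ in general, we see that a corollary of Theorem 33.2 is the simple fact that the set of all tensor products of the form (33.10) is a generating set for $\mathscr{T}^p_q(\mathscr{V})$. We define a tensor that can be represented as a tensor product to be a simple tensor. It should be noted, however, that such a representation for a simple tensor is not unique. Indeed, from (33.8), we have

$$\mathbf{v}\otimes\mathbf{v}^*=(2\mathbf{v})\otimes(\tfrac{1}{2}\mathbf{v}^*)$$

for any $\mathbf{v}\in\mathscr{V}$ and $\mathbf{v}^*\in\mathscr{V}^*$ since

$$(2\mathbf{v})\otimes(\tfrac{1}{2}\mathbf{v}^*)(\mathbf{u}^*,\mathbf{u})=\langle\mathbf{u}^*,2\mathbf{v}\rangle\langle\tfrac{1}{2}\mathbf{v}^*,\mathbf{u}\rangle=\langle\mathbf{u}^*,\mathbf{v}\rangle\langle\mathbf{v}^*,\mathbf{u}\rangle=\mathbf{v}\otimes\mathbf{v}^*(\mathbf{u}^*,\mathbf{u})$$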

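for all $\mathbf{u}^*$ and $\mathbf{u}$. In general, the tensor product, as defined by (33.10), can be regarded as a mapping

$$\otimes:\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{p\text{ times}}\times\underbrace{\mathscr{V}^*\times\cdots\times\mathscr{V}^*}_{q\text{ times}}\to\mathscr{T}^p_q(\mathscr{V}) \tag{33.21}$$

given by

$$\otimes:(\mathbf{v}_1,\dots,\mathbf{v}_p,\mathbf{v}^1,\dots,\mathbf{v}^q)\mapsto\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q \tag{33.22}$$

for all $\mathbf{v}_1,\dots,\mathbf{v}_p\in\mathscr{V}$ and $\mathbf{v}^1,\dots,\mathbf{v}^q\in\mathscr{V}^*$. It is easy to verify that the mapping $\otimes$ is multilinear in the usual sense, i.e.,

$$\otimes(\mathbf{v}_1,\dots,\alpha\mathbf{v}+\beta\mathbf{u},\dots,\mathbf{v}^q)=\alpha\otimes(\mathbf{v}_1,\dots,\mathbf{v},\dots,\mathbf{v}^q)+\beta\otimes(\mathbf{v}_1,\dots,\mathbf{u},\dots,\mathbf{v}^q) \tag{33.23}$$

where $\alpha\mathbf{v}+\beta\mathbf{u}$, $\mathbf{v}$ and $\mathbf{u}$ all take the same position in the argument of $\otimes$ but that position is arbitrary. From (33.23), we see that

$$(\alpha\mathbf{v}_1)\otimes\mathbf{v}_2\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q=\mathbf{v}_1\otimes(\alpha\mathbf{v}_2)\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q$$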

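We can extend the operation of tensor product from vectors and covectors to tensors in general. If $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ and $\mathbf{B}\in\mathscr{T}^r_s(\mathscr{V})$, then we define their tensor product $\mathbf{A}\otimes\mathbf{B}$ to be a tensor of order $(p+r,q+s)$ by

$$\mathbf{A}\otimes\mathbf{B}(\mathbf{v}^1,\dots,\mathbf{v}^{p+r},\mathbf{v}_1,\dots,\mathbf{v}_{q+s})=\mathbf{A}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)\,\mathbf{B}(\mathbf{v}^{p+1},\dots,\mathbf{v}^{p+r},\mathbf{v}_{q+1},\dots,\mathbf{v}_{q+s}) \tag{33.24}$$

for all $\mathbf{v}^1,\dots,\mathbf{v}^{p+r}\in\mathscr{V}^*$ and $\mathbf{v}_1,\dots,\mathbf{v}_{q+s}\in\mathscr{V}$. Applying this definition to arbitrary tensors $\mathbf{A}$ and $\mathbf{B}$ yields a mapping

$$\otimes:\mathscr{T}^p_q(\mathscr{V})\times\mathscr{T}^r_s(\mathscr{V})\to\mathscr{T}^{p+r}_{q+s}(\mathscr{V}) \tag{33.25}$$

Clearly this operation can be further extended to more than two tensor spaces, say

$$\otimes:\mathscr{T}^{p_1}_{q_1}(\mathscr{V})\times\cdots\times\mathscr{T}^{p_k}_{q_k}(\mathscr{V})\to\mathscr{T}^{p_1+\cdots+p_k}_{q_1+\cdots+q_k}(\mathscr{V}) \tag{33.26}$$

in such a way that

$$\otimes(\mathbf{A}_1,\dots,\mathbf{A}_k)=\mathbf{A}_1\otimes\cdots\otimes\mathbf{A}_k \tag{33.27}$$

where $\mathbf{A}_i\in\mathscr{T}^{p_i}_{q_i}(\mathscr{V})$, $i=1,\dots,k$. It is easy to verify that this tensor product operation is also multilinear in the sense generalizing (33.23) to tensors. In component form

$$(\mathbf{A}\otimes\mathbf{B})^{i_1\cdots i_{p+r}}{}_{j_1\cdots j_{q+s}}=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,B^{i_{p+1}\cdots i_{p+r}}{}_{j_{q+1}\cdots j_{q+s}} \tag{33.28}$$

which can be generalized obviously for (33.27) also. From (33.28), or from (33.24), we see that the tensor product is not commutative, but it is associative and distributive.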
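The component rule (33.28) lends itself to a direct check of these algebraic properties. The following minimal sketch assumes $N=2$ and stores a tensor as a dictionary from a pair (tuple of upper indices, tuple of lower indices) to its component; this storage scheme, the helper `from_matrix`, and the sample matrices are assumptions made only for illustration.

```python
from itertools import product

N = 2  # dimension of V, chosen only for the example

def from_matrix(M):
    """Store a tensor of order (1, 1) as {(upper indices, lower indices): component}."""
    return {((i,), (j,)): M[i][j] for i in range(N) for j in range(N)}

def tensor_product(A, B):
    """Component form (33.28): the upper and the lower index tuples are concatenated."""
    return {
        (ia + ib, ja + jb): a * b
        for ((ia, ja), a), ((ib, jb), b) in product(A.items(), B.items())
    }

# Three hypothetical tensors of order (1, 1).
A = from_matrix([[1, 2], [0, 3]])
B = from_matrix([[0, 1], [4, 5]])
C = from_matrix([[2, 0], [1, 1]])

AB = tensor_product(A, B)   # order (2, 2), hence N**4 components
BA = tensor_product(B, A)
left = tensor_product(AB, C)
right = tensor_product(A, tensor_product(B, C))
```

For these values `AB` and `BA` differ, for instance in the component with upper indices $(1,2)$ and lower indices $(2,1)$ in the text's 1-based numbering, while `left` and `right` coincide, as associativity requires.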


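Relative to a product basis the component form of a tensor $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ is given by (33.20) where the components of $\mathbf{A}$ can be obtained by (33.18). If we transform the basis $\{\mathbf{e}_i\}$ to $\{\hat{\mathbf{e}}_i\}$ as shown by (31.21) and (31.22), then the components of $\mathbf{A}$ as well as the product basis relative to $\{\mathbf{e}_i\}$ must be transformed also. The following theorem gives the transformation laws.

Theorem 33.3. Under the transformation of bases (31.21) and (31.22) the product basis (33.14) for $\mathscr{T}^p_q(\mathscr{V})$ transforms according to the rule

$$\hat{\mathbf{e}}_{i_1}\otimes\cdots\otimes\hat{\mathbf{e}}_{i_p}\otimes\hat{\mathbf{e}}^{j_1}\otimes\cdots\otimes\hat{\mathbf{e}}^{j_q}=T^{k_1}_{i_1}\cdots T^{k_p}_{i_p}\,\hat{T}^{j_1}_{l_1}\cdots\hat{T}^{j_q}_{l_q}\,\mathbf{e}_{k_1}\otimes\cdots\otimes\mathbf{e}_{k_p}\otimes\mathbf{e}^{l_1}\otimes\cdots\otimes\mathbf{e}^{l_q} \tag{33.29}$$

and the components of any tensor $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ transform according to the rule

$$\hat{A}^{i_1\cdots i_p}{}_{j_1\cdots j_q}=\hat{T}^{i_1}_{k_1}\cdots\hat{T}^{i_p}_{k_p}\,T^{l_1}_{j_1}\cdots T^{l_q}_{j_q}\,A^{k_1\cdots k_p}{}_{l_1\cdots l_q} \tag{33.30}$$

The proof of these rules involves no more than the multilinearity of the tensor product $\otimes$ and the tensor $\mathbf{A}$. Many classical treatises on tensors use a transformation rule such as (33.30) to define a tensor. The next theorem connects this alternate definition with the one we used.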


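Theorem 33.4. Given any two sets of $N^{(p+q)}$ scalars $\{A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\}$ and $\{\hat{A}^{i_1\cdots i_p}{}_{j_1\cdots j_q}\}$ related by the transformation rule (33.30), there exists a tensor $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ whose components relative to the product bases of $\{\mathbf{e}_i\}$ and $\{\hat{\mathbf{e}}_i\}$ are $\{A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\}$ and $\{\hat{A}^{i_1\cdots i_p}{}_{j_1\cdots j_q}\}$, provided that the bases are related by (31.21) and (31.22).

This theorem is obvious, since we can define the tensor $\mathbf{A}$ by (33.20); then the transformation rule (33.30) shows that the components of $\mathbf{A}$ relative to the product basis of $\{\hat{\mathbf{e}}_i\}$ are $\{\hat{A}^{i_1\cdots i_p}{}_{j_1\cdots j_q}\}$. Thus a tensor $\mathbf{A}$ corresponds to an equivalence set of components, with the transformation rule serving as the equivalence relation.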
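A small computation illustrates why the transformation rule describes one and the same tensor in two bases. The sketch below rests on an assumed convention, since (31.21) and (31.22) are not restated in this section: the basis vectors are taken to transform with a matrix $[T^k_j]$, contravariant components with its inverse $[\hat{T}^i_k]$, and covariant components with $[T^l_j]$. All names and numerical values are hypothetical.

```python
N = 2
T = [[2, 1], [1, 1]]        # T[k][j] stands for T^k_j (assumed convention, hypothetical values)
T_hat = [[1, -1], [-1, 2]]  # T_hat[i][k] stands for \hat{T}^i_k, the inverse matrix of T

def transform(A):
    """Rule (33.30) for order (1, 1): A_hat^i_j = T_hat^i_k T^l_j A^k_l."""
    return [[sum(T_hat[i][k] * T[l][j] * A[k][l]
                 for k in range(N) for l in range(N))
             for j in range(N)] for i in range(N)]

def evaluate(A, covector, vector):
    """A(v*, v) = A^i_j v*_i v^j in whichever dual bases the components refer to."""
    return sum(A[i][j] * covector[i] * vector[j]
               for i in range(N) for j in range(N))

A = [[1, 2], [3, 4]]   # components A^k_l in the original basis
v = [5, -1]            # contravariant components v^k
v_star = [2, 7]        # covariant components v*_l

A_hat = transform(A)
v_hat = [sum(T_hat[i][k] * v[k] for k in range(N)) for i in range(N)]
v_star_hat = [sum(T[l][j] * v_star[l] for l in range(N)) for j in range(N)]

old_value = evaluate(A, v_star, v)
new_value = evaluate(A_hat, v_star_hat, v_hat)
```

The components change from `A` to `A_hat`, but the value of the bilinear function on a fixed covector and vector is unchanged, which is the sense in which both component sets belong to a single tensor.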


As an illustration of the preceding theorem, let us examine whether or not there exists a tensor whose components relative to the product basis of any basis are the values of the generalized Kronecker delta introduced in Section 20. The answer turns out to be yes, since we have the identity

$$\delta^{i_1\cdots i_r}_{j_1\cdots j_r}=\hat{T}^{i_1}_{k_1}\cdots\hat{T}^{i_r}_{k_r}\,T^{l_1}_{j_1}\cdots T^{l_r}_{j_r}\,\delta^{k_1\cdots k_r}_{l_1\cdots l_r} \tag{33.31}$$

which follows from the fact that $[T^i_j]$ and $[\hat{T}^i_j]$ are the inverses of each other. We leave the proof of this identity as an exercise for the reader. From Theorem 33.4 and the identity (33.31), we see that there exists a tensor $\mathbf{K}_r$ of order $(r,r)$ such that

$$\mathbf{K}_r=\frac{1}{r!}\,\delta^{i_1\cdots i_r}_{j_1\cdots j_r}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_r}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_r} \tag{33.32}$$

relative to any basis $\{\mathbf{e}_i\}$. This tensor plays an important role in the next chapter.


By the same line of reasoning, we may ask whether or not there exists a tensor whose components relative to the product basis of any basis are the values of the $\varepsilon$-symbols also introduced in Section 20. The answer turns out to be no, since we have the identities

$$\varepsilon_{i_1\cdots i_N}=\det[\hat{T}^i_j]\;T^{j_1}_{i_1}\cdots T^{j_N}_{i_N}\,\varepsilon_{j_1\cdots j_N} \tag{33.33}$$

and

$$\varepsilon^{i_1\cdots i_N}=\det[T^i_j]\;\hat{T}^{i_1}_{j_1}\cdots\hat{T}^{i_N}_{j_N}\,\varepsilon^{j_1\cdots j_N} \tag{33.34}$$

which also follow from the fact that $[T^i_j]$ and $[\hat{T}^i_j]$ are the inverses of each other [cf. (21.8)]. Since these identities do not agree with the transformation rule (33.30), we can conclude that there exists no tensor whose components relative to the product basis of any basis are always the values of the $\varepsilon$-symbols. In other words, if the values of the $\varepsilon$-symbols are the components of a tensor relative to the product basis of one particular basis $\{\mathbf{e}_i\}$, then the components of the same tensor relative to the product basis of another basis generally are not the values of the $\varepsilon$-symbols, unless the transformation matrices $[T^i_j]$ and $[\hat{T}^i_j]$ have unit determinant.
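The contrast between the Kronecker delta and the $\varepsilon$-symbols can be seen in the smallest case $N=2$. The sketch below applies the ordinary tensor rule (33.30) to both sets of values, under the assumed convention that covariant components transform with a matrix $[T^k_i]$ and contravariant components with its inverse $[\hat{T}^i_k]$; the matrices are hypothetical and are chosen with determinant different from one.

```python
from fractions import Fraction

N = 2
T = [[2, 1], [0, 3]]                                   # T[k][i] stands for T^k_i, det = 6
T_hat = [[Fraction(1, 2), Fraction(-1, 6)],
         [Fraction(0), Fraction(1, 3)]]                # inverse matrix of T

delta = [[1, 0], [0, 1]]     # delta^k_l, treated as components of order (1, 1)
eps = [[0, 1], [-1, 0]]      # epsilon_kl, treated as components of order (0, 2)

# Rule (33.30) applied to the Kronecker delta.
delta_hat = [[sum(T_hat[i][k] * T[l][j] * delta[k][l]
                  for k in range(N) for l in range(N))
              for j in range(N)] for i in range(N)]

# Rule (33.30) applied to the epsilon-symbols as if they were tensor components.
eps_hat = [[sum(T[k][i] * T[l][j] * eps[k][l]
                for k in range(N) for l in range(N))
            for j in range(N)] for i in range(N)]

det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
```

The delta values are reproduced exactly, while the $\varepsilon$ values come back multiplied by the determinant, so they are not the components of a tensor in every basis.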


In the classical treatises on tensors, the transformation rules (33.33) and (33.34), or more generally

$$\hat{A}^{i_1\cdots i_p}{}_{j_1\cdots j_q}=\epsilon\,\bigl|\det[T^i_j]\bigr|^{w}\,\hat{T}^{i_1}_{k_1}\cdots\hat{T}^{i_p}_{k_p}\,T^{l_1}_{j_1}\cdots T^{l_q}_{j_q}\,A^{k_1\cdots k_p}{}_{l_1\cdots l_q} \tag{33.35}$$

are used to define relative tensors. The exponent $w$ and the coefficient $\epsilon$ on the right-hand side of (33.35) are called the weight and the parity of the relative tensor, respectively. A relative tensor is called polar if its parity has the value $+1$ in all transformations, while a relative tensor is called axial if $\epsilon$ is equal to the sign of the determinant of the transformation matrix $[T^i_j]$. In particular, (33.33) shows that $\{\varepsilon_{i_1\cdots i_N}\}$ are the components of an axial covariant tensor of order $N$ and weight $-1$, while (33.34) shows that $\{\varepsilon^{i_1\cdots i_N}\}$ are the components of an axial contravariant tensor of order $N$ and weight $+1$. We shall see some more examples of relative tensors in the next chapter.


Exercises 

33.1 Prove Theorem 33.1. 

33.2 Prove equation (33.31). 

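33.3 Under the canonical isomorphism of $\mathscr{L}(\mathscr{V};\mathscr{V})$ with $\mathscr{T}^1_1(\mathscr{V})$, show that an endomorphism $\mathbf{A}:\mathscr{V}\to\mathscr{V}$ and its dual $\mathbf{A}^*:\mathscr{V}^*\to\mathscr{V}^*$ correspond to the same tensor.

33.4 Define an isomorphism from $\mathscr{L}(\mathscr{L}(\mathscr{V};\mathscr{V});\mathscr{L}(\mathscr{V};\mathscr{V}))$ to $\mathscr{T}^2_2(\mathscr{V})$ independent of any basis.

33.5 If $\mathbf{A}\in\mathscr{T}_2(\mathscr{V})$, show that the determinant of the component matrix $[A_{ij}]$ of $\mathbf{A}$ defines a polar scalar of weight two, i.e., the determinant obeys the transformation rule

$$\det[\hat{A}_{ij}]=\bigl(\det[T^i_j]\bigr)^2\det[A_{ij}]$$

33.6 Define isomorphisms from $\mathscr{L}(\mathscr{V};\mathscr{V}^*)$ to $\mathscr{T}_2(\mathscr{V})$, $\mathscr{L}(\mathscr{V}^*;\mathscr{V})$ to $\mathscr{T}^2(\mathscr{V})$, and from $\mathscr{L}(\mathscr{V}^*;\mathscr{V}^*)$ to $\mathscr{T}^1_1(\mathscr{V})$ independent of any basis.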

33.7 Given a relation of the form

$$A_{i_1\cdots i_p klm}\,B(k,l,m)=C_{i_1\cdots i_p}$$

where $\{A_{i_1\cdots i_p klm}\}$ and $\{C_{i_1\cdots i_p}\}$ are components of tensors with respect to any basis, show that the set $\{B(k,l,m)\}$, likewise, can be regarded as components of a tensor, belonging to $\mathscr{T}^3(\mathscr{V})$ in this case. This result is known as the quotient theorem in classical treatises on tensors.

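33.8 If $\mathbf{A}\in\mathscr{T}_{p+q}(\mathscr{V})$, define a tensor $T_{pq}\mathbf{A}\in\mathscr{T}_{p+q}(\mathscr{V})$ by

$$T_{pq}\mathbf{A}(\mathbf{u}_1,\dots,\mathbf{u}_q,\mathbf{v}_1,\dots,\mathbf{v}_p)=\mathbf{A}(\mathbf{v}_1,\dots,\mathbf{v}_p,\mathbf{u}_1,\dots,\mathbf{u}_q) \tag{33.36}$$

for all $\mathbf{u}_1,\dots,\mathbf{u}_q,\mathbf{v}_1,\dots,\mathbf{v}_p\in\mathscr{V}$. Show that the operation

$$T_{pq}:\mathscr{T}_{p+q}(\mathscr{V})\to\mathscr{T}_{p+q}(\mathscr{V}) \tag{33.37}$$

is an automorphism of $\mathscr{T}_{p+q}(\mathscr{V})$. We call $T_{pq}$ the generalized transpose operation. What is the relation between the components of $\mathbf{A}$ and $T_{pq}\mathbf{A}$?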
Section 34. Contractions

In this section we shall consider the operation of contracting a tensor of order $(p,q)$ to obtain a tensor of order $(p-1,q-1)$, where $p,q$ are greater than or equal to one. To define this important operation, we prove first a useful property of the tensor space $\mathscr{T}^p_q(\mathscr{V})$. In the preceding section we defined a tensor of order $(p,q)$ to be a multilinear function

$$\mathbf{A}:\underbrace{\mathscr{V}^*\times\cdots\times\mathscr{V}^*}_{p\text{ times}}\times\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{q\text{ times}}\to\mathscr{R}$$

Clearly, this concept can be generalized to a multilinear transformation from the $(p,q)$-fold Cartesian product of $\mathscr{V}$ and $\mathscr{V}^*$ to an arbitrary vector space $\mathscr{U}$, namely

$$\mathbf{Z}:\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{p\text{ times}}\times\underbrace{\mathscr{V}^*\times\cdots\times\mathscr{V}^*}_{q\text{ times}}\to\mathscr{U} \tag{34.1}$$

The condition that $\mathbf{Z}$ be a multilinear transformation is similar to that for a multilinear function, namely, $\mathbf{Z}$ is linear in each one of its variables while its other variables are held constant, e.g., the tensor product $\otimes$ given by (33.21) is a multilinear transformation. The next theorem shows that, in some sense, any multilinear transformation $\mathbf{Z}$ of the form (34.1) can be factored through the tensor product $\otimes$ given by (33.21). This fact is known as the universal factorization property of the tensor product.


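Theorem 34.1. If $\mathbf{Z}$ is an arbitrary multilinear transformation of the form (34.1), then there exists a unique linear transformation

$$\mathbf{C}:\mathscr{T}^p_q(\mathscr{V})\to\mathscr{U} \tag{34.2}$$

such that

$$\mathbf{Z}(\mathbf{v}_1,\dots,\mathbf{v}_p,\mathbf{v}^1,\dots,\mathbf{v}^q)=\mathbf{C}(\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q) \tag{34.3}$$

for all $\mathbf{v}_1,\dots,\mathbf{v}_p\in\mathscr{V}$ and $\mathbf{v}^1,\dots,\mathbf{v}^q\in\mathscr{V}^*$.

Proof. Since the simple tensors form a generating set of $\mathscr{T}^p_q(\mathscr{V})$, if the linear transformation $\mathbf{C}$ satisfying (34.3) exists, then it must be unique. To prove the existence of $\mathbf{C}$, we choose a basis $\{\mathbf{e}_i\}$ for $\mathscr{V}$ and define the product basis (33.14) for $\mathscr{T}^p_q(\mathscr{V})$ as before. Then we define

$$\mathbf{C}(\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q})=\mathbf{Z}(\mathbf{e}_{i_1},\dots,\mathbf{e}_{i_p},\mathbf{e}^{j_1},\dots,\mathbf{e}^{j_q}) \tag{34.4}$$

for all $i_1,\dots,i_p,j_1,\dots,j_q=1,\dots,N$ and extend $\mathbf{C}$ to all tensors in $\mathscr{T}^p_q(\mathscr{V})$ by linearity. Now it is clear that the linear transformation $\mathbf{C}$ defined in this way satisfies the condition (34.3), since both sides of (34.3) are multilinear in the vectors $\mathbf{v}_1,\dots,\mathbf{v}_p$ and the covectors $\mathbf{v}^1,\dots,\mathbf{v}^q$, and they agree on the dual bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ for $\mathscr{V}$ and $\mathscr{V}^*$, as shown in (34.4). Hence they agree on all $(\mathbf{v}_1,\dots,\mathbf{v}_p,\mathbf{v}^1,\dots,\mathbf{v}^q)$.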


If we use the symbol $\otimes$ to denote the multilinear transformation (33.21), then the condition (34.3) can be rewritten as

$$\mathbf{Z}=\mathbf{C}\circ\otimes \tag{34.5}$$

where the operation $\circ$ on the right-hand side of (34.5) denotes the composition as defined in Section 3. Equation (34.5) expresses the meaning of the universal factorization property. In the modern treatises on tensors, this property is often used to define the tensor product and the tensor spaces. Our approach to the concept of a tensor is a compromise between this abstract modern concept and the classical concept based on transformation rules; the preceding theorem and Theorems 33.3 and 33.4 connect our concept with the other two.


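Having proved the universal factorization property of the tensor product, we can now define the operation of contraction. Recall that if $\mathbf{v}\in\mathscr{V}$ and $\mathbf{v}^*\in\mathscr{V}^*$, then the scalar product $\langle\mathbf{v},\mathbf{v}^*\rangle$ is a scalar. Here we have used the canonical isomorphism given by (32.6). Of course, the operation

$$\langle\ ,\ \rangle:\mathscr{V}\times\mathscr{V}^*\to\mathscr{R}$$

is a bilinear function. By Theorem 34.1, $\langle\ ,\ \rangle$ can be factored through the tensor product

$$\otimes:\mathscr{V}\times\mathscr{V}^*\to\mathscr{T}^1_1(\mathscr{V})$$

i.e., there exists a linear map

$$\mathbf{C}:\mathscr{T}^1_1(\mathscr{V})\to\mathscr{R} \tag{34.6}$$

such that

$$\langle\ ,\ \rangle=\mathbf{C}\circ\otimes$$

or, equivalently,

$$\langle\mathbf{v},\mathbf{v}^*\rangle=\mathbf{C}(\mathbf{v}\otimes\mathbf{v}^*) \tag{34.7}$$

for all $\mathbf{v}\in\mathscr{V}$ and $\mathbf{v}^*\in\mathscr{V}^*$. This linear function $\mathbf{C}$ is the simplest kind of contraction operation. It transforms the tensor space $\mathscr{T}^1_1(\mathscr{V})$ to the tensor space $\mathscr{T}^{1-1}_{1-1}(\mathscr{V})=\mathscr{T}^0_0(\mathscr{V})=\mathscr{R}$.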


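In general if $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ is an arbitrary simple tensor, say

$$\mathbf{A}=\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^q \tag{34.8}$$

then for each pair of integers $(i,j)$, where $1\le i\le p$, $1\le j\le q$, we seek a unique linear transformation

$$\mathbf{C}^i_j:\mathscr{T}^p_q(\mathscr{V})\to\mathscr{T}^{p-1}_{q-1}(\mathscr{V}) \tag{34.9}$$

such that

$$\mathbf{C}^i_j\mathbf{A}=\langle\mathbf{v}^j,\mathbf{v}_i\rangle\,\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_{i-1}\otimes\mathbf{v}_{i+1}\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^{j-1}\otimes\mathbf{v}^{j+1}\otimes\cdots\otimes\mathbf{v}^q \tag{34.10}$$

for all simple tensors $\mathbf{A}$. A more compact notation for the tensor product on the right-hand side is

$$\mathbf{v}_1\otimes\cdots\hat{\mathbf{v}}_i\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\hat{\mathbf{v}}^j\cdots\otimes\mathbf{v}^q \tag{34.11}$$

where the factors carrying a caret are omitted. Since the representation of a simple tensor by a tensor product is not unique, the existence of such a linear transformation $\mathbf{C}^i_j$ is by no means obvious. However, we can prove that $\mathbf{C}^i_j$ does exist and is uniquely determined by the condition (34.10). Indeed, using the universal factorization property, we can prove the following theorem.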


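Theorem 34.2. A unique contraction operation $\mathbf{C}^i_j$ satisfying the condition (34.10) exists.

Proof. We define a multilinear transformation of the form (34.1) with $\mathscr{U}=\mathscr{T}^{p-1}_{q-1}(\mathscr{V})$ by

$$\mathbf{Z}(\mathbf{v}_1,\dots,\mathbf{v}_p,\mathbf{v}^1,\dots,\mathbf{v}^q)=\langle\mathbf{v}^j,\mathbf{v}_i\rangle\,\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_{i-1}\otimes\mathbf{v}_{i+1}\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^{j-1}\otimes\mathbf{v}^{j+1}\otimes\cdots\otimes\mathbf{v}^q \tag{34.12}$$

for all $\mathbf{v}_1,\dots,\mathbf{v}_p\in\mathscr{V}$ and $\mathbf{v}^1,\dots,\mathbf{v}^q\in\mathscr{V}^*$. Then by Theorem 34.1 there exists a unique linear transformation $\mathbf{C}^i_j$ of the form (34.9) such that (34.10) holds, and thus the proof is complete.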


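Next we express the contraction operation in component form.

Theorem 34.3. If $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ is an arbitrary tensor of order $(p,q)$, then in component form relative to any dual bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ we have

$$(\mathbf{C}^i_j\mathbf{A})^{k_1\cdots k_{p-1}}{}_{l_1\cdots l_{q-1}}=A^{k_1\cdots k_{i-1}\,t\,k_i\cdots k_{p-1}}{}_{l_1\cdots l_{j-1}\,t\,l_j\cdots l_{q-1}} \tag{34.13}$$

Proof. Since the contraction operation is linear, applying it to the component form (33.20), we get

$$\mathbf{C}^i_j\mathbf{A}=A^{k_1\cdots k_p}{}_{l_1\cdots l_q}\langle\mathbf{e}^{l_j},\mathbf{e}_{k_i}\rangle\,\mathbf{e}_{k_1}\otimes\cdots\hat{\mathbf{e}}_{k_i}\cdots\otimes\mathbf{e}_{k_p}\otimes\mathbf{e}^{l_1}\otimes\cdots\hat{\mathbf{e}}^{l_j}\cdots\otimes\mathbf{e}^{l_q} \tag{34.14}$$

which means nothing but the formula (34.13) because we have $\langle\mathbf{e}^{l_j},\mathbf{e}_{k_i}\rangle=\delta^{l_j}_{k_i}$.

In particular, for the special case $\mathbf{C}$ given by (34.6) we have

$$\mathbf{C}(\mathbf{A})=A^t{}_t \tag{34.15}$$

for all $\mathbf{A}\in\mathscr{T}^1_1(\mathscr{V})$. If we now make use of the canonical isomorphism

$$\mathscr{T}^1_1(\mathscr{V})\cong\mathscr{L}(\mathscr{V};\mathscr{V})$$

we see that $\mathbf{C}$ coincides with the trace operation defined by (19.8).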
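The component formula of Theorem 34.3 can be tried on small examples. The sketch below assumes $N=2$ and stores a tensor as a dictionary from (tuple of upper indices, tuple of lower indices) to its component; this storage scheme and the sample values are assumptions for illustration. It contracts a simple tensor of order $(2,1)$ in both possible ways, so the results can be compared with (34.10), and it contracts a tensor of order $(1,1)$ to obtain its trace.

```python
from itertools import product

N = 2  # dimension of V, chosen only for the example

def contract(A, i, j):
    """(34.13): sum over the i-th upper and the j-th lower index (1-based, as in the text)."""
    result = {}
    for (upper, lower), value in A.items():
        if upper[i - 1] == lower[j - 1]:
            key = (upper[:i - 1] + upper[i:], lower[:j - 1] + lower[j:])
            result[key] = result.get(key, 0) + value
    return result

# A hypothetical simple tensor v1 (x) v2 (x) w of order (2, 1).
v1, v2, w = [1, 2], [3, -1], [4, 5]
A = {((a, b), (c,)): v1[a] * v2[b] * w[c]
     for a, b, c in product(range(N), repeat=3)}

C11 = contract(A, 1, 1)   # expected <w, v1> v2 by (34.10)
C21 = contract(A, 2, 1)   # expected <w, v2> v1 by (34.10)

# A tensor of order (1, 1) with component matrix [[1, 2], [3, 4]].
M = {((i,), (j,)): [[1, 2], [3, 4]][i][j] for i in range(N) for j in range(N)}
trace = contract(M, 1, 1)
```

Here $\langle\mathbf{w},\mathbf{v}_1\rangle=14$ and $\langle\mathbf{w},\mathbf{v}_2\rangle=7$, so the two contractions return $14\,\mathbf{v}_2$ and $7\,\mathbf{v}_1$, and the last result is the trace $1+4=5$ stored under the empty index pair.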


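Using the contraction operation, we can define a scalar product for tensors. If $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ and $\mathbf{B}\in\mathscr{T}^q_p(\mathscr{V})$, the tensor product $\mathbf{A}\otimes\mathbf{B}$ is a tensor in $\mathscr{T}^{p+q}_{p+q}(\mathscr{V})$ and is defined by (33.24). If we apply the contraction

$$\mathbf{C}=\underbrace{\mathbf{C}^1_1\circ\cdots\circ\mathbf{C}^1_1}_{q\text{ times}}\circ\underbrace{\mathbf{C}^1_{q+1}\circ\cdots\circ\mathbf{C}^1_{q+1}}_{p\text{ times}} \tag{34.16}$$

to $\mathbf{A}\otimes\mathbf{B}$, then the result is a scalar $\langle\mathbf{A},\mathbf{B}\rangle$, namely

$$\langle\mathbf{A},\mathbf{B}\rangle=\mathbf{C}(\mathbf{A}\otimes\mathbf{B}) \tag{34.17}$$

called the scalar product of $\mathbf{A}$ and $\mathbf{B}$. It is a simple matter to see that $\langle\ ,\ \rangle$ is a bilinear and definite function in the sense similar to those properties of the scalar product of $\mathbf{v}$ and $\mathbf{v}^*$ explained in Section 31. We can use the bilinear function

$$\langle\ ,\ \rangle:\mathscr{T}^p_q(\mathscr{V})\times\mathscr{T}^q_p(\mathscr{V})\to\mathscr{R} \tag{34.18}$$

to identify the space $\mathscr{T}^p_q(\mathscr{V})$ with the dual space $\mathscr{T}^q_p(\mathscr{V})^*$ of the space $\mathscr{T}^q_p(\mathscr{V})$ or, equivalently, we can define the dual space $\mathscr{T}^q_p(\mathscr{V})^*$ abstractly as usual and then introduce a canonical isomorphism from $\mathscr{T}^p_q(\mathscr{V})$ to $\mathscr{T}^q_p(\mathscr{V})^*$ through (34.17). Thus we write

$$\mathscr{T}^p_q(\mathscr{V})\cong\mathscr{T}^q_p(\mathscr{V})^* \tag{34.19}$$

Of course we shall also identify the second dual space $\mathscr{T}^p_q(\mathscr{V})^{**}$ with $\mathscr{T}^p_q(\mathscr{V})$ as explained in general in Section 32. Hence, we have

$$\mathscr{T}^q_p(\mathscr{V})\cong\mathscr{T}^p_q(\mathscr{V})^* \tag{34.20}$$

which follows also by interchanging $p$ and $q$ in (34.19). From (34.13) and (34.16), the scalar product $\langle\mathbf{A},\mathbf{B}\rangle$ is given by the component form

$$\langle\mathbf{A},\mathbf{B}\rangle=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,B^{j_1\cdots j_q}{}_{i_1\cdots i_p} \tag{34.21}$$

relative to any dual bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ for $\mathscr{V}$ and $\mathscr{V}^*$.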
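The component form (34.21) can be sketched in a few lines. The example below assumes $N=2$ and the same dictionary storage as before, mapping (upper indices, lower indices) to a component; the names and values are hypothetical. It pairs a tensor $\mathbf{A}$ of order $(1,1)$ with the simple tensor $\mathbf{v}\otimes\mathbf{v}^*$, for which the scalar product should reduce to the value $\mathbf{A}(\mathbf{v}^*,\mathbf{v})$.

```python
N = 2  # dimension of V, chosen only for the example

def scalar_product(A, B):
    """(34.21): pair each component A^{i...}_{j...} with B^{j...}_{i...}."""
    return sum(value * B.get((lower, upper), 0)
               for (upper, lower), value in A.items())

M = [[1, 2], [3, 4]]   # hypothetical components A^i_j
v = [5, -1]            # components v^j of a vector
v_star = [2, 7]        # components v*_i of a covector

A = {((i,), (j,)): M[i][j] for i in range(N) for j in range(N)}
# Components of the simple tensor v (x) v*: upper index from v, lower index from v*.
B = {((j,), (i,)): v[j] * v_star[i] for i in range(N) for j in range(N)}

paired = scalar_product(A, B)
direct = sum(M[i][j] * v_star[i] * v[j] for i in range(N) for j in range(N))
```

The two numbers agree, which is a special case of the identity stated in Exercise 34.1.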


Exercises 


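34.1 Show that

$$\langle\mathbf{A},\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_q\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^p\rangle=\mathbf{A}(\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{v}_1,\dots,\mathbf{v}_q)$$

for all $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$, $\mathbf{v}_1,\dots,\mathbf{v}_q\in\mathscr{V}$ and $\mathbf{v}^1,\dots,\mathbf{v}^p\in\mathscr{V}^*$.

34.2 Give another proof of the quotient theorem in Exercise 33.7 by showing that the operation

defined by

(JA)B =C; o---0€, o C1, 00C (A@B)
s times r times

for all Ac Z,"(¥) and Be 7” (7) is an isomorphism. Since there are many such isomorphisms by choosing contractions of pairs of indices differently, we do not make any one of them a canonical isomorphism in general unless stated explicitly.

34.3 Use the universal factorization property and prove the existence and the uniqueness of the generalized transpose operation $T_{pq}$ defined by (33.36) in Exercise 33.8. Hint: Require $T_{pq}$ to be an automorphism of $\mathscr{T}_{p+q}(\mathscr{V})$ such that

$$T_{pq}(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^p\otimes\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q)=\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q\otimes\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^p \tag{34.22}$$

for all simple tensors $\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^p\otimes\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q\in\mathscr{T}_{p+q}(\mathscr{V})$.

34.4 If $\{\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q}\}$ is the product basis for $\mathscr{T}^p_q(\mathscr{V})$ induced by the dual bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$, construct its dual basis in $\mathscr{T}^q_p(\mathscr{V})$ with respect to the scalar product defined by (34.17).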
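Section 35. Tensors on Inner Product Spaces

The main result of this section is that if $\mathscr{V}$ is equipped with a particular inner product, then we can identify the tensor spaces of the same total order by means of various canonical isomorphisms, a special case of these being the isomorphism $\mathbf{G}$ from $\mathscr{V}$ to $\mathscr{V}^*$, which we have introduced in Section 31 [cf. (31.5)].

Recall that the isomorphism $\mathbf{G}:\mathscr{V}\to\mathscr{V}^*$ is defined by the condition [cf. (31.6)]

$$\langle\mathbf{G}\mathbf{u},\mathbf{v}\rangle=\langle\mathbf{G}\mathbf{v},\mathbf{u}\rangle=\mathbf{u}\cdot\mathbf{v},\qquad\mathbf{u},\mathbf{v}\in\mathscr{V} \tag{35.1}$$

In general, if $\mathbf{A}$ is a simple, pure, contravariant tensor of order $p$, say

$$\mathbf{A}=\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p$$

then we define the pure covariant representation of $\mathbf{A}$ to be the tensor

$$\mathbf{G}^p\mathbf{A}=\mathbf{G}\mathbf{v}_1\otimes\cdots\otimes\mathbf{G}\mathbf{v}_p\in\mathscr{T}_p(\mathscr{V}) \tag{35.2}$$

By linearity, $\mathbf{G}^p$ can be extended to all of $\mathscr{T}^p(\mathscr{V})$. Equation (35.2) means that

$$\mathbf{G}^p\mathbf{A}(\mathbf{u}_1,\dots,\mathbf{u}_p)=\langle\mathbf{G}\mathbf{v}_1,\mathbf{u}_1\rangle\cdots\langle\mathbf{G}\mathbf{v}_p,\mathbf{u}_p\rangle=(\mathbf{v}_1\cdot\mathbf{u}_1)\cdots(\mathbf{v}_p\cdot\mathbf{u}_p) \tag{35.3}$$

for all $\mathbf{u}_1,\dots,\mathbf{u}_p\in\mathscr{V}$. Equation (35.3) generalizes the condition (35.1) from $\mathbf{v}$ to $\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p$.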


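Of course, because the tensor product is not one-to-one, we have to show that the covariant tensor $\mathbf{G}^p\mathbf{A}$ does not depend on the representation of $\mathbf{A}$. This fact follows directly from the universal factorization of the tensor product. Indeed, if we define a $p$-linear map

$$\mathbf{Z}:\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{p\text{ times}}\to\mathscr{T}_p(\mathscr{V})$$

by

$$\mathbf{Z}(\mathbf{v}_1,\dots,\mathbf{v}_p)=\mathbf{G}\mathbf{v}_1\otimes\cdots\otimes\mathbf{G}\mathbf{v}_p \tag{35.4}$$

then $\mathbf{Z}$ can be factored through the tensor product $\otimes$, i.e., there exists a unique linear transformation $\mathbf{G}^p$ from $\mathscr{T}^p(\mathscr{V})$ to $\mathscr{T}_p(\mathscr{V})$,

$$\mathbf{G}^p:\mathscr{T}^p(\mathscr{V})\to\mathscr{T}_p(\mathscr{V}) \tag{35.5}$$

such that

$$\mathbf{Z}=\mathbf{G}^p\circ\otimes$$

or, equivalently,

$$\mathbf{Z}(\mathbf{v}_1,\dots,\mathbf{v}_p)=\mathbf{G}^p(\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p)=\mathbf{G}^p\mathbf{A} \tag{35.6}$$

Comparing (35.6) with (35.4) we see that $\mathbf{G}^p$ obeys the condition (35.2), and thus $\mathbf{G}^p\mathbf{A}$ is well defined by (35.3).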


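It is easy to verify that $\mathbf{G}^p$ is an isomorphism. Indeed $(\mathbf{G}^p)^{-1}$ is the unique linear transformation from $\mathscr{T}_p(\mathscr{V})$ to $\mathscr{T}^p(\mathscr{V})$ such that

$$(\mathbf{G}^p)^{-1}(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^p)=\mathbf{G}^{-1}\mathbf{v}^1\otimes\cdots\otimes\mathbf{G}^{-1}\mathbf{v}^p \tag{35.7}$$

for all $\mathbf{v}^1,\dots,\mathbf{v}^p\in\mathscr{V}^*$. Thus $\mathbf{G}^p$ makes $\mathscr{T}^p(\mathscr{V})$ and $\mathscr{T}_p(\mathscr{V})$ isomorphic, just as $\mathbf{G}$ makes $\mathscr{V}$ and $\mathscr{V}^*$ isomorphic. In fact, $\mathbf{G}=\mathbf{G}^1$.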


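Clearly, we can extend the preceding argument to mixed tensor spaces on $\mathscr{V}$ also. If $\mathbf{A}$ is a mixed simple tensor in $\mathscr{T}^p_q(\mathscr{V})$, say

$$\mathbf{A}=\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q \tag{35.8}$$

then we define the pure covariant representation of $\mathbf{A}$ to be the tensor

$$\mathbf{G}^p_q\mathbf{A}=\mathbf{G}\mathbf{v}_1\otimes\cdots\otimes\mathbf{G}\mathbf{v}_p\otimes\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q \tag{35.9}$$

and $\mathbf{G}^p_q$ is an isomorphism from $\mathscr{T}^p_q(\mathscr{V})$ to $\mathscr{T}_{p+q}(\mathscr{V})$,

$$\mathbf{G}^p_q:\mathscr{T}^p_q(\mathscr{V})\to\mathscr{T}_{p+q}(\mathscr{V}) \tag{35.10}$$

Indeed, its inverse is characterized by

$$(\mathbf{G}^p_q)^{-1}(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^p\otimes\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q)=\mathbf{G}^{-1}\mathbf{v}^1\otimes\cdots\otimes\mathbf{G}^{-1}\mathbf{v}^p\otimes\mathbf{u}^1\otimes\cdots\otimes\mathbf{u}^q \tag{35.11}$$

for all $\mathbf{v}^1,\dots,\mathbf{v}^p,\mathbf{u}^1,\dots,\mathbf{u}^q\in\mathscr{V}^*$. Clearly, by suitable compositions of the operations $\mathbf{G}^p_q$ and their inverses, we can define isomorphisms between tensor spaces of the same total order. For example, if $p_1$ and $q_1$ are another pair of integers such that

$$p_1+q_1=p+q$$

then $\mathscr{T}^p_q(\mathscr{V})$ is isomorphic to $\mathscr{T}^{p_1}_{q_1}(\mathscr{V})$ by the isomorphism

$$(\mathbf{G}^{p_1}_{q_1})^{-1}\circ\mathbf{G}^p_q:\mathscr{T}^p_q(\mathscr{V})\to\mathscr{T}^{p_1}_{q_1}(\mathscr{V}) \tag{35.12}$$

In particular, if $q_1=0$, $p_1=p+q$, then for any $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ the tensor $(\mathbf{G}^{p+q})^{-1}\mathbf{G}^p_q\mathbf{A}\in\mathscr{T}^{p+q}(\mathscr{V})$ is called the pure contravariant representation of $\mathbf{A}$. For example, if $\mathbf{A}$ is given by (35.8), then its pure contravariant representation is given by

$$(\mathbf{G}^{p+q})^{-1}\mathbf{G}^p_q\mathbf{A}=\mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_p\otimes\mathbf{G}^{-1}\mathbf{u}^1\otimes\cdots\otimes\mathbf{G}^{-1}\mathbf{u}^q$$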
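We shall now express the isomorphism $\mathbf{G}^p_q$ in component form. If $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ has the component representation

$$\mathbf{A}=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.13}$$

then from (35.9) we obtain

$$\mathbf{G}^p_q\mathbf{A}=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{G}\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{G}\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.14}$$

This result can be written as

$$\mathbf{G}^p_q\mathbf{A}=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\bar{\mathbf{e}}_{i_1}\otimes\cdots\otimes\bar{\mathbf{e}}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.15}$$

where $\{\bar{\mathbf{e}}_i\}$ is a basis in $\mathscr{V}^*$ reciprocal to $\{\mathbf{e}^i\}$, i.e.,

$$\bar{\mathbf{e}}_i\cdot\mathbf{e}^j=\delta^j_i \tag{35.16}$$

since it follows from (19.1), (12.6), and (35.1) that we have

$$\bar{\mathbf{e}}_i=\mathbf{G}\mathbf{e}_i=e_{ij}\mathbf{e}^j \tag{35.17}$$

where

$$e_{ij}=\mathbf{e}_i\cdot\mathbf{e}_j \tag{35.18}$$

Substituting (35.17) into (35.15), we obtain

$$\mathbf{G}^p_q\mathbf{A}=A_{i_1\cdots i_p j_1\cdots j_q}\,\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.19}$$

where

$$A_{i_1\cdots i_p j_1\cdots j_q}=e_{i_1k_1}\cdots e_{i_pk_p}\,A^{k_1\cdots k_p}{}_{j_1\cdots j_q} \tag{35.20}$$

This equation illustrates the component form of the isomorphism $\mathbf{G}^p_q$. It has the effect of lowering the first $p$ superscripts on the components $\{A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\}$. Thus $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ and $\mathbf{G}^p_q\mathbf{A}\in\mathscr{T}_{p+q}(\mathscr{V})$ have the same components if the bases in (35.13) and (35.15) are used. On the other hand, if the usual product bases for $\mathscr{T}^p_q(\mathscr{V})$ and $\mathscr{T}_{p+q}(\mathscr{V})$ are used, as in (35.13) and (35.19), the components of $\mathbf{A}$ and $\mathbf{G}^p_q\mathbf{A}$ are related by (35.20).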


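Since $\mathbf{G}^p_q$ is an isomorphism, its inverse exists and is a linear transformation from $\mathscr{T}_{p+q}(\mathscr{V})$ to $\mathscr{T}^p_q(\mathscr{V})$. If $\mathbf{A}\in\mathscr{T}_{p+q}(\mathscr{V})$ has the component representation

$$\mathbf{A}=A_{i_1\cdots i_p j_1\cdots j_q}\,\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.21}$$

then from (35.11)

$$(\mathbf{G}^p_q)^{-1}\mathbf{A}=A_{i_1\cdots i_p j_1\cdots j_q}\,\mathbf{G}^{-1}\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{G}^{-1}\mathbf{e}^{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.22}$$

By the same argument as before, this formula can be rewritten as

$$(\mathbf{G}^p_q)^{-1}\mathbf{A}=A_{i_1\cdots i_p j_1\cdots j_q}\,\bar{\mathbf{e}}^{i_1}\otimes\cdots\otimes\bar{\mathbf{e}}^{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.23}$$

where $\{\bar{\mathbf{e}}^i\}$ is a basis of $\mathscr{V}$ reciprocal to $\{\mathbf{e}_i\}$, or equivalently as

$$(\mathbf{G}^p_q)^{-1}\mathbf{A}=A^{i_1\cdots i_p}{}_{j_1\cdots j_q}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_p}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_q} \tag{35.24}$$

where

$$A^{i_1\cdots i_p}{}_{j_1\cdots j_q}=e^{i_1k_1}\cdots e^{i_pk_p}\,A_{k_1\cdots k_p j_1\cdots j_q} \tag{35.25}$$

Here of course $[e^{ij}]$ is the inverse matrix of $[e_{ij}]$ and is given by

$$e^{ij}=\mathbf{e}^i\cdot\mathbf{e}^j \tag{35.26}$$

since

$$\bar{\mathbf{e}}^i=\mathbf{G}^{-1}\mathbf{e}^i=e^{ik}\mathbf{e}_k \tag{35.27}$$

Equation (35.25) illustrates the component form of the isomorphism $(\mathbf{G}^p_q)^{-1}$. It has the effect of raising the first $p$ subscripts on the components $\{A_{i_1\cdots i_p j_1\cdots j_q}\}$.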
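Lowering and raising can be tried out on a tensor of order $(1,1)$ in a basis that is not orthonormal. The sketch below assumes $N=2$ and a hypothetical basis $\mathbf{e}_1=(1,0)$, $\mathbf{e}_2=(1,1)$ of the plane with the usual dot product, so that $[e_{ij}]$ and its inverse $[e^{ij}]$ have integer entries. The function names and values are illustrative only.

```python
N = 2
e_lo = [[1, 1], [1, 2]]    # e_ij = e_i . e_j for the hypothetical basis e_1 = (1, 0), e_2 = (1, 1)
e_up = [[2, -1], [-1, 1]]  # e^ij, the inverse matrix of e_ij

def lower_first(A_mixed):
    """(35.20) with p = q = 1: A_ij = e_ik A^k_j."""
    return [[sum(e_lo[i][k] * A_mixed[k][j] for k in range(N))
             for j in range(N)] for i in range(N)]

def raise_first(A_cov):
    """(35.25) with p = q = 1: A^i_j = e^ik A_kj."""
    return [[sum(e_up[i][k] * A_cov[k][j] for k in range(N))
             for j in range(N)] for i in range(N)]

A_mixed = [[1, 2], [3, 4]]          # hypothetical components A^k_j
A_cov = lower_first(A_mixed)        # pure covariant components A_ij
A_back = raise_first(A_cov)         # raising again recovers A^k_j
```

The round trip returns the original components, reflecting the fact that (35.25) is the inverse of (35.20).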


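Combining (35.20) and (35.25), we see that the component form of the isomorphism $(\mathbf{G}^{p_1}_{q_1})^{-1}\circ\mathbf{G}^p_q$ in (35.12) is given by raising the first $p_1-p$ subscripts of $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ if $p_1>p$, or by lowering the last $p-p_1$ superscripts of $\mathbf{A}$ if $p>p_1$. Thus if $\mathbf{A}$ has the component form (35.13), then $(\mathbf{G}^{p_1}_{q_1})^{-1}\mathbf{G}^p_q\mathbf{A}$ has the component form

$$(\mathbf{G}^{p_1}_{q_1})^{-1}\mathbf{G}^p_q\mathbf{A}=A^{i_1\cdots i_{p_1}}{}_{j_1\cdots j_{q_1}}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_{p_1}}\otimes\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_{q_1}} \tag{35.28}$$

where

$$A^{i_1\cdots i_{p_1}}{}_{j_1\cdots j_{q_1}}=A^{i_1\cdots i_p}{}_{k_1\cdots k_{p_1-p}\,j_1\cdots j_{q_1}}\,e^{i_{p+1}k_1}\cdots e^{i_{p_1}k_{p_1-p}}\quad\text{if }p_1>p \tag{35.29}$$

and

$$A^{i_1\cdots i_{p_1}}{}_{j_1\cdots j_{q_1}}=A^{i_1\cdots i_{p_1}k_1\cdots k_{p-p_1}}{}_{j_{p-p_1+1}\cdots j_{q_1}}\,e_{k_1j_1}\cdots e_{k_{p-p_1}j_{p-p_1}}\quad\text{if }p>p_1 \tag{35.30}$$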
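For example, if $\mathbf{A}\in\mathscr{T}^1_2(\mathscr{V})$ has the representation

$$\mathbf{A}=A^i{}_{jk}\,\mathbf{e}_i\otimes\mathbf{e}^j\otimes\mathbf{e}^k$$

then the covariant representation of $\mathbf{A}$ is

$$\mathbf{G}^1_2\mathbf{A}=A^i{}_{jk}\,\mathbf{G}\mathbf{e}_i\otimes\mathbf{e}^j\otimes\mathbf{e}^k=A^i{}_{jk}\,e_{il}\,\mathbf{e}^l\otimes\mathbf{e}^j\otimes\mathbf{e}^k=A_{ljk}\,\mathbf{e}^l\otimes\mathbf{e}^j\otimes\mathbf{e}^k$$

the contravariant representation of $\mathbf{A}$ is

$$(\mathbf{G}^3)^{-1}\mathbf{G}^1_2\mathbf{A}=A^i{}_{jk}\,\mathbf{e}_i\otimes\mathbf{G}^{-1}\mathbf{e}^j\otimes\mathbf{G}^{-1}\mathbf{e}^k=A^i{}_{jk}\,e^{jl}e^{km}\,\mathbf{e}_i\otimes\mathbf{e}_l\otimes\mathbf{e}_m=A^{ilm}\,\mathbf{e}_i\otimes\mathbf{e}_l\otimes\mathbf{e}_m$$

and the representation of $\mathbf{A}$ in $\mathscr{T}^2_1(\mathscr{V})$ is

$$(\mathbf{G}^2_1)^{-1}\mathbf{G}^1_2\mathbf{A}=A^i{}_{jk}\,\mathbf{e}_i\otimes\mathbf{G}^{-1}\mathbf{e}^j\otimes\mathbf{e}^k=A^i{}_{jk}\,e^{jl}\,\mathbf{e}_i\otimes\mathbf{e}_l\otimes\mathbf{e}^k=A^{il}{}_k\,\mathbf{e}_i\otimes\mathbf{e}_l\otimes\mathbf{e}^k$$

etc. These formulas follow from (35.20), (35.24), and (35.29).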


Of course, we can apply the operations of lowering and raising to indices at any position, 
e.g., we can define a “component” A’, °""" for Ac F(Y )by 
2 Ibo r 


At, bei = A etle’... g'an.. (35.31) 


a Ieee T oh 


However, for simplicity, we have not yet assigned any symbol to such representations whose “components” have an irregular arrangement of superscripts and subscripts. In our notation for the component representation of a tensor $\mathbf{A}\in\mathscr{T}^p_q(\mathscr{V})$ the contravariant superscripts always come first, so that the components of $\mathbf{A}$ are written as $A^{i_1\cdots i_p}{}_{j_1\cdots j_q}$ as shown in (35.13), not as $A_{j_1\cdots j_q}{}^{i_1\cdots i_p}$ or as any other rearrangement of positions of the superscripts $i_1\cdots i_p$ and the subscripts $j_1\cdots j_q$, such as the one defined by (35.31). In order to indicate precisely the position of the contravariant indices and the covariant indices in the irregular component form, we may use, for example, the notation


VOV'OV®@:-@OV®@+-@V'®@: (35.32) 
Q 8) B (a) (b) 


for the tensor space whose elements have components of the form on the left-hand side of (35.31), where the order of $\mathscr{V}$ and $\mathscr{V}^*$ in (35.32) is the same as that of the contravariant and the covariant indices, respectively, in the component form. In particular, the simple notation $\mathscr{T}^p_q(\mathscr{V})$ now corresponds to

$$\underset{(1)}{\mathscr{V}}\otimes\cdots\otimes\underset{(p)}{\mathscr{V}}\otimes\underset{(p+1)}{\mathscr{V}^*}\otimes\cdots\otimes\underset{(p+q)}{\mathscr{V}^*} \tag{35.33}$$


Since the irregular tensor spaces, such as the one in (35.32), are not convenient to use, we shall avoid them as much as possible, and we shall not bother to generalize the notation $\mathbf{G}^p_q$ to isomorphisms from the irregular tensor spaces to their corresponding pure covariant representations.


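So far, we have generalized the isomorphism $\mathbf{G}$ for vectors to the isomorphisms $\mathbf{G}^p_q$ for tensors of type $(p,q)$ in general. Equation (35.1) for $\mathbf{G}$, however, contains more information than just the fact that $\mathbf{G}$ is an isomorphism. If we read that equation in reverse, we see that $\mathbf{G}$ can be used to compute the inner product on $\mathscr{V}$. Indeed, we can rewrite that equation as

$$\mathbf{v}\cdot\mathbf{u}=\langle\mathbf{G}\mathbf{v},\mathbf{u}\rangle=\langle\mathbf{G}^1\mathbf{v},\mathbf{u}\rangle \tag{35.34}$$

This idea can be generalized easily to tensors. For example, if $\mathbf{A}$ and $\mathbf{B}$ are tensors in $\mathscr{T}^r(\mathscr{V})$, then we can define an inner product $\mathbf{A}\cdot\mathbf{B}$ by

$$\mathbf{A}\cdot\mathbf{B}=\langle\mathbf{G}^r\mathbf{A},\mathbf{B}\rangle \tag{35.35}$$

We leave the proof to the reader that (35.35) actually defines an inner product on $\mathscr{T}^r(\mathscr{V})$. By use of (34.21) and (35.20), it is possible to write (35.35) in the component form

$$\mathbf{A}\cdot\mathbf{B}=A^{j_1\cdots j_r}\,e_{i_1j_1}\cdots e_{i_rj_r}\,B^{i_1\cdots i_r} \tag{35.36}$$

or, equivalently, in the forms

$$\mathbf{A}\cdot\mathbf{B}=\begin{cases}A_{i_1\cdots i_r}\,B^{i_1\cdots i_r}\\[2pt] A^{i_1\cdots i_r}\,B_{i_1\cdots i_r}\\[2pt] A^{i_1\cdots i_{r-1}}{}_{i_r}\,B_{i_1\cdots i_{r-1}}{}^{i_r},\quad\text{etc.}\end{cases} \tag{35.37}$$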
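For $r=2$ the first form in (35.37) can be evaluated directly once the metric components are known. The sketch below assumes $N=2$ and the hypothetical non-orthonormal basis with $[e_{ij}]=\begin{bmatrix}1&1\\1&2\end{bmatrix}$; the contravariant components of $\mathbf{A}$ and $\mathbf{B}$ are sample values, and the names `lower_all` and `inner` are illustrative only.

```python
from itertools import product

N = 2
e_lo = [[1, 1], [1, 2]]  # e_ij for a hypothetical non-orthonormal basis

def lower_all(A):
    """Pure covariant components A_kl = e_ki e_lj A^ij (order r = 2), as in (35.20)."""
    return [[sum(e_lo[k][i] * e_lo[l][j] * A[i][j]
                 for i, j in product(range(N), repeat=2))
             for l in range(N)] for k in range(N)]

def inner(A, B):
    """(35.37), first form: A . B = A_kl B^kl."""
    A_low = lower_all(A)
    return sum(A_low[k][l] * B[k][l] for k, l in product(range(N), repeat=2))

A = [[1, 0], [2, -1]]   # hypothetical contravariant components A^ij
B = [[0, 3], [1, 2]]    # hypothetical contravariant components B^ij

AB = inner(A, B)
BA = inner(B, A)
AA = inner(A, A)
```

For these values the product is symmetric in its two arguments and $\mathbf{A}\cdot\mathbf{A}$ is positive, as an inner product requires; this is a spot check on one example, not a proof.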


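Equation (35.37) suggests definitions of inner products for other tensor spaces besides $\mathscr{T}^r(\mathscr{V})$. If $\mathbf{A}$ and $\mathbf{B}$ are in $\mathscr{T}^p_q(\mathscr{V})$, we define $\mathbf{A}\cdot\mathbf{B}$ by

$$\mathbf{A}\cdot\mathbf{B}=(\mathbf{G}^{p+q})^{-1}\mathbf{G}^p_q\mathbf{A}\cdot(\mathbf{G}^{p+q})^{-1}\mathbf{G}^p_q\mathbf{B}=\langle\mathbf{G}^p_q\mathbf{A},(\mathbf{G}^{p+q})^{-1}\mathbf{G}^p_q\mathbf{B}\rangle \tag{35.38}$$

Again, we leave it as an exercise to the reader to establish that (35.38) does define an inner product on $\mathscr{T}^p_q(\mathscr{V})$; moreover, the inner product can be written in component form by (35.37) also.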


In Section 32, we pointed out that if we agree to use a particular inner product, then the isomorphism $\mathbf{G}$ can be regarded as canonical. The formulas of this section clearly indicate the desirability of this procedure if for no other reason than notational simplicity. Thus, from this point on, we shall identify the dual space $\mathscr{V}^*$ with $\mathscr{V}$, i.e.,

$$\mathscr{V}\cong\mathscr{V}^*$$

by suppressing the symbol $\mathbf{G}$. Then in view of (35.2) and (35.7), we shall suppress the symbol $\mathbf{G}^p_q$ also. Thus, we shall identify all tensor spaces of the same total order, i.e., we write

$$\mathscr{T}^p_q(\mathscr{V})\cong\mathscr{T}^{p_1}_{q_1}(\mathscr{V}),\qquad p+q=p_1+q_1 \tag{35.39}$$

By this procedure we can replace scalar products throughout our formulas by inner products according to the formulas (31.9) and (35.38).


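The identification (35.39) means that, as long as the total order of a tensor is given, it is no longer necessary to specify separately the contravariant order and the covariant order. These separate orders will arise only when we select a particular component representation of the tensor. For example, if $\mathbf{A}$ is of total order $r$, then we can express $\mathbf{A}$ by the following different component forms:

$$\begin{aligned}
\mathbf{A}&=A^{i_1\cdots i_r}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_r}\\
&=A^{i_1\cdots i_{r-1}}{}_{j_r}\,\mathbf{e}_{i_1}\otimes\cdots\otimes\mathbf{e}_{i_{r-1}}\otimes\mathbf{e}^{j_r}\\
&\;\;\vdots\\
&=A_{j_1\cdots j_r}\,\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_r}
\end{aligned} \tag{35.40}$$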


where the placement of the indices indicates that the first form is the pure contravariant 
representation, the last form is the pure covariant representation, while the intermediate forms 
are various mixed tensor representations, all having the same total order r. Of course, the 
various representations in (35.40) are related by the formulas (35.20), (35.25), (35.29), and 
(35.30). In fact, if (35.31) is used, we can even represent A by an irregular component form 
such as 


A=A pi e, Qet Ge, 08e, @---@er®.-- (35.41) 


vee Jb 
provided that the total order is unchanged.


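The contraction operator defined in Section 34 can now be rewritten in a form more
convenient for tensors defined on an inner product space. If $\mathbf{A}$ is a simple tensor of (total)
order $r$, $r \ge 2$, with the representation

$$\mathbf{A} = \mathbf{v}_1\otimes\cdots\otimes\mathbf{v}_r$$

then $\mathbf{C}_{ij}\mathbf{A}$, where $1 \le i < j \le r$, is a simple tensor of (total) order $r-2$ defined by [cf. (34.10)]

$$\mathbf{C}_{ij}\mathbf{A} = (\mathbf{v}_i\cdot\mathbf{v}_j)\,\mathbf{v}_1\otimes\cdots\widehat{\mathbf{v}_i}\cdots\widehat{\mathbf{v}_j}\cdots\otimes\mathbf{v}_r \tag{35.42}$$

where a hat over a factor means that the factor is deleted. By linearity, $\mathbf{C}_{ij}$ can be extended to all tensors of order $r$. If $\mathbf{A}\in\mathscr{T}_r(\mathscr{V})$ has the representation

$$\mathbf{A} = A_{k_1\ldots k_r}\,\mathbf{e}^{k_1}\otimes\cdots\otimes\mathbf{e}^{k_r}$$

then by (35.42) and (35.26)

$$\mathbf{C}_{ij}\mathbf{A} = e^{k_ik_j}A_{k_1\ldots k_r}\,\mathbf{e}^{k_1}\otimes\cdots\widehat{\mathbf{e}^{k_i}}\cdots\widehat{\mathbf{e}^{k_j}}\cdots\otimes\mathbf{e}^{k_r} \tag{35.43}$$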


can be written 


C=C,,0C,,0---0€ (35.44) 


1(r+1) 
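Also, if $\mathbf{A}$ and $\mathbf{B}$ are in $\mathscr{T}_r(\mathscr{V})$, it is easily established that [cf. (34.17)]

$$\mathbf{A}\cdot\mathbf{B} = \mathbf{C}(\mathbf{A}\otimes\mathbf{B}) \tag{35.45}$$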
In closing this chapter, it is convenient for later use to record certain formulas here. The
identity automorphism $\mathbf{I}$ of $\mathscr{V}$ corresponds to a tensor of order 2. Its pure covariant
representation is simply the inner product

$$\mathbf{I}(\mathbf{u},\mathbf{v}) = \mathbf{u}\cdot\mathbf{v} \tag{35.46}$$

The tensor $\mathbf{I}$ can be represented in any of the following forms:

$$\mathbf{I} = e_{ij}\,\mathbf{e}^i\otimes\mathbf{e}^j = e^{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j = \delta^i_j\,\mathbf{e}_i\otimes\mathbf{e}^j = \delta^j_i\,\mathbf{e}^i\otimes\mathbf{e}_j \tag{35.47}$$

There is but one contraction for $\mathbf{I}$, namely

$$\mathbf{C}_{12}\mathbf{I} = \operatorname{tr}\mathbf{I} = \delta^i_i = N \tag{35.48}$$

In view of (35.46), the identity tensor is also called the metric tensor of the inner product space.


Exercises 


35.1 Show that (35.45) defines an inner product on $\mathscr{T}_r(\mathscr{V})$.
35.2 Show that the formula (35.38) can be rewritten as 


A-B=((G})'T G/A,B) (35.49) 


Pq q 


where $\mathbf{T}_{pq}$ is the generalized transpose operator defined in Exercises 33.8 and 34.3. In
particular, if $\mathbf{A}$ and $\mathbf{B}$ are second order tensors, then (35.49) reduces to the more familiar
formula:

$$\mathbf{A}\cdot\mathbf{B} = \operatorname{tr}(\mathbf{A}^T\mathbf{B}) \tag{35.50}$$


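35.3 Show that the linear transformation

$$\mathbf{G}^p:\mathscr{T}^p(\mathscr{V})\to\mathscr{T}_p(\mathscr{V})$$

defined by (35.2) can also be characterized by

$$(\mathbf{G}^p\mathbf{A})(\mathbf{u}_1,\ldots,\mathbf{u}_p) = \mathbf{A}(\mathbf{G}\mathbf{u}_1,\ldots,\mathbf{G}\mathbf{u}_p) \tag{35.51}$$

for all $\mathbf{A}\in\mathscr{T}^p(\mathscr{V})$ and all $\mathbf{u}_1,\ldots,\mathbf{u}_p\in\mathscr{V}$.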
35.4 Show that if A is a second-order tensor, then 


A=C,,(I@A) 


What is the component form of this identity? 
Chapter 8


EXTERIOR ALGEBRA 


The purpose of this chapter is to formulate enough machinery to define the determinant of 
an endomorphism in a component free fashion, to introduce the concept of an orientation for a 
vector space, and to establish a certain isomorphism which generalizes the classical operation of 
vector product to an N-dimensional vector space. For simplicity, we shall assume throughout this
chapter that the vector space, and thus its associated tensor spaces, are equipped with an inner 
product. Further, for definiteness, we shall use only the pure covariant representation; 
transformations into other representations shall be explained in Section 42. 


Section 36 Skew-Symmetric Tensors and Symmetric Tensors 


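If $\mathbf{A}\in\mathscr{T}_r(\mathscr{V})$ and $\sigma$ is a given permutation of $\{1,\ldots,r\}$, then we can define a new tensor
$\mathbf{T}_\sigma\mathbf{A}\in\mathscr{T}_r(\mathscr{V})$ by the formula

$$\mathbf{T}_\sigma\mathbf{A}(\mathbf{v}_1,\ldots,\mathbf{v}_r) = \mathbf{A}(\mathbf{v}_{\sigma(1)},\ldots,\mathbf{v}_{\sigma(r)}) \tag{36.1}$$

for all $\mathbf{v}_1,\ldots,\mathbf{v}_r\in\mathscr{V}$. For example, the generalized transpose operation $\mathbf{T}_{pq}$ defined by (33.36) in
Exercise 33.8 is a special case of $\mathbf{T}_\sigma$ for a particular choice of $\sigma$.

Naturally, we call $\mathbf{T}_\sigma\mathbf{A}$ the $\sigma$-transpose of $\mathbf{A}$ for any $\sigma$ in general. We have obtained the
components of the transpose $\mathbf{T}_{pq}$ in Exercise 33.8. For an arbitrary permutation $\sigma$ the components
of $\mathbf{T}_\sigma\mathbf{A}$ are related to those of $\mathbf{A}$ by

$$(\mathbf{T}_\sigma\mathbf{A})_{i_1\ldots i_r} = A_{i_{\sigma(1)}\ldots i_{\sigma(r)}} \tag{36.2}$$

Using the same argument as that of Exercise 34.3, we can characterize $\mathbf{T}_\sigma$ by the condition that

$$\mathbf{T}_\sigma(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r) = \mathbf{v}^{\sigma^{-1}(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma^{-1}(r)} \tag{36.3}$$

for all simple tensors $\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r\in\mathscr{T}_r(\mathscr{V})$.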
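The component rule (36.2) is easy to experiment with numerically. The following is a minimal sketch, not part of the original text: it assumes a tensor is stored as a Python dictionary from index tuples to numbers, uses 0-based indices, and the helper names `transpose` and `compose` are hypothetical. It also checks the composition rule $\mathbf{T}_{\sigma\tau} = \mathbf{T}_\sigma\mathbf{T}_\tau$ of Exercise 36.2 on one example.

```python
from itertools import product

def transpose(A, sigma):
    """Components of T_sigma A by (36.2): (T_sigma A)[i_1..i_r] = A[i_sigma(1)..i_sigma(r)].

    A maps index tuples to numbers; sigma is a 0-based tuple, sigma[k] = image of k.
    """
    return {idx: A[tuple(idx[s] for s in sigma)] for idx in A}

def compose(sigma, tau):
    """The permutation (sigma tau)(k) = sigma(tau(k))."""
    return tuple(sigma[t] for t in tau)

N, r = 2, 3
# Arbitrary illustrative components of an order-3 tensor on a 2-dimensional space.
A = {idx: 100 * idx[0] + 10 * idx[1] + idx[2] for idx in product(range(N), repeat=r)}

sigma = (1, 2, 0)
tau = (1, 0, 2)
lhs = transpose(transpose(A, tau), sigma)   # T_sigma (T_tau A)
rhs = transpose(A, compose(sigma, tau))     # T_{sigma tau} A
print(lhs == rhs)
```

The dictionary representation is chosen only for transparency; any array type indexed by tuples would serve equally well.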
If $\mathbf{T}_\sigma\mathbf{A} = \mathbf{A}$ for all permutations $\sigma$, $\mathbf{A}$ is said to be (completely) symmetric. On the other
hand if $\mathbf{T}_\sigma\mathbf{A} = \varepsilon_\sigma\mathbf{A}$ for all permutations $\sigma$, where $\varepsilon_\sigma$ denotes the parity of $\sigma$ defined in Section
20, then $\mathbf{A}$ is said to be (completely) skew-symmetric. For example, the identity tensor $\mathbf{I}$ given by
(35.30) is a symmetric second-order tensor, while the tensor $\mathbf{u}\otimes\mathbf{v} - \mathbf{v}\otimes\mathbf{u}$, for any vectors
$\mathbf{u},\mathbf{v}\in\mathscr{V}$, is clearly a skew-symmetric second-order tensor. We shall denote by $\hat{\mathscr{T}}_r(\mathscr{V})$ the set of
all skew-symmetric tensors in $\mathscr{T}_r(\mathscr{V})$. We leave it to the reader to establish the fact that $\hat{\mathscr{T}}_r(\mathscr{V})$ is
a subspace of $\mathscr{T}_r(\mathscr{V})$. Elements of $\hat{\mathscr{T}}_r(\mathscr{V})$ are often called r-vectors or r-forms.


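Theorem 36.1. An element $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$ assumes the value 0 if any two of its variables coincide.

Proof. We wish to establish that

$$\mathbf{A}(\mathbf{v}_1,\ldots,\mathbf{v},\ldots,\mathbf{v},\ldots,\mathbf{v}_r) = 0 \tag{36.4}$$

This result is a special case of the formula

$$\mathbf{A}(\mathbf{v}_1,\ldots,\mathbf{v}_s,\ldots,\mathbf{v}_t,\ldots,\mathbf{v}_r) + \mathbf{A}(\mathbf{v}_1,\ldots,\mathbf{v}_t,\ldots,\mathbf{v}_s,\ldots,\mathbf{v}_r) = 0 \tag{36.5}$$

which follows by the fact that $\varepsilon_\sigma = -1$ for the permutation which switches the pair of indices $(s,t)$
while leaving the remaining indices unchanged. If we take $\mathbf{v} = \mathbf{v}_s = \mathbf{v}_t$ in (36.5), then (36.4)
follows.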


Corollary. An element $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$ assumes the value zero if it is evaluated on a linearly
dependent set of vectors.

This corollary generalizes the result of the preceding theorem but is itself also a direct
consequence of that theorem, for if $\{\mathbf{v}_1,\ldots,\mathbf{v}_r\}$ is a linearly dependent set, then at least one of the
vectors can be expressed as a linear combination of the remaining ones. From the r-linearity of
$\mathbf{A}$, $\mathbf{A}(\mathbf{v}_1,\ldots,\mathbf{v}_r)$ can then be written as a linear combination of quantities which are all equal to
zero because they are the values of $\mathbf{A}$ at arguments having at least two equal variables.

Corollary. If $r$ is greater than $N$, the dimension of $\mathscr{V}$, then $\hat{\mathscr{T}}_r(\mathscr{V}) = \{\mathbf{0}\}$.

This corollary follows from the last corollary and the fact that every set of more than $N$
vectors in an N-dimensional space is linearly dependent.
Exercises 


36.1 Show that the set of symmetric tensors of order $r$ forms a subspace of $\mathscr{T}_r(\mathscr{V})$.
36.2 Show that

$$\mathbf{T}_{\sigma\tau} = \mathbf{T}_\sigma\mathbf{T}_\tau$$

for all permutations $\sigma$ and $\tau$. Also, show that $\mathbf{T}_\sigma$ is the identity automorphism of $\mathscr{T}_r(\mathscr{V})$ if
and only if $\sigma$ is the identity permutation.
Section 37 The Skew-Symmetric Operator


In this section we shall construct a projection from $\mathscr{T}_r(\mathscr{V})$ into $\hat{\mathscr{T}}_r(\mathscr{V})$. This projection is
called the skew-symmetric operator. If $\mathbf{A}\in\mathscr{T}_r(\mathscr{V})$, we define the skew-symmetric projection $\mathbf{K}_r\mathbf{A}$
of $\mathbf{A}$ by

$$\mathbf{K}_r\mathbf{A} = \frac{1}{r!}\sum_\sigma \varepsilon_\sigma\mathbf{T}_\sigma\mathbf{A} \tag{37.1}$$

where the summation is taken over all permutations $\sigma$ of $\{1,\ldots,r\}$. The endomorphism

$$\mathbf{K}_r:\mathscr{T}_r(\mathscr{V})\to\mathscr{T}_r(\mathscr{V}) \tag{37.2}$$

defined in this way is called the skew-symmetric operator.


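Before showing that $\mathbf{K}_r$ has the desired properties, we give one example first. For simplicity, let
us choose $r = 2$. Then there are only two permutations of $\{1,2\}$, namely

$$\sigma_1 = \begin{pmatrix}1 & 2\\ 1 & 2\end{pmatrix},\qquad \sigma_2 = \begin{pmatrix}1 & 2\\ 2 & 1\end{pmatrix} \tag{37.3}$$

and their parities are

$$\varepsilon_{\sigma_1} = 1 \quad\text{and}\quad \varepsilon_{\sigma_2} = -1 \tag{37.4}$$

Substituting (37.3) and (37.4) into (37.1), we get

$$(\mathbf{K}_2\mathbf{A})(\mathbf{v}_1,\mathbf{v}_2) = \tfrac{1}{2}\bigl(\mathbf{A}(\mathbf{v}_1,\mathbf{v}_2) - \mathbf{A}(\mathbf{v}_2,\mathbf{v}_1)\bigr) \tag{37.5}$$

for all $\mathbf{A}\in\mathscr{T}_2(\mathscr{V})$ and $\mathbf{v}_1,\mathbf{v}_2\in\mathscr{V}$. In particular, if $\mathbf{A}$ is skew-symmetric, namely

$$\mathbf{A}(\mathbf{v}_2,\mathbf{v}_1) = -\mathbf{A}(\mathbf{v}_1,\mathbf{v}_2) \tag{37.6}$$

then (37.5) reduces to

$$(\mathbf{K}_2\mathbf{A})(\mathbf{v}_1,\mathbf{v}_2) = \mathbf{A}(\mathbf{v}_1,\mathbf{v}_2)$$

or, equivalently,

$$\mathbf{K}_2\mathbf{A} = \mathbf{A},\qquad \mathbf{A}\in\hat{\mathscr{T}}_2(\mathscr{V}) \tag{37.7}$$

Since from (37.5), $\mathbf{K}_2\mathbf{A}\in\hat{\mathscr{T}}_2(\mathscr{V})$ for any $\mathbf{A}\in\mathscr{T}_2(\mathscr{V})$, by (37.7) we then have

$$\mathbf{K}_2(\mathbf{K}_2\mathbf{A}) = \mathbf{K}_2\mathbf{A}$$

or, equivalently

$$\mathbf{K}_2^2 = \mathbf{K}_2 \tag{37.8}$$

which means that $\mathbf{K}_2$ is a projection. Hence, for second order tensors, $\mathbf{K}_2$ has the desired
properties. We shall now prove the same for tensors in general.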


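Theorem 37.1. Let $\mathbf{K}_r:\mathscr{T}_r(\mathscr{V})\to\mathscr{T}_r(\mathscr{V})$ be defined by (37.1). Then the range of $\mathbf{K}_r$ is $\hat{\mathscr{T}}_r(\mathscr{V})$,
namely

$$R(\mathbf{K}_r) = \hat{\mathscr{T}}_r(\mathscr{V}) \tag{37.9}$$

Moreover, the restriction of $\mathbf{K}_r$ on $\hat{\mathscr{T}}_r(\mathscr{V})$ is the identity automorphism of $\hat{\mathscr{T}}_r(\mathscr{V})$, i.e.,

$$\mathbf{K}_r\mathbf{A} = \mathbf{A},\qquad \mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V}) \tag{37.10}$$

Proof. We prove the equation (37.10) first. If $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$, then by definition we have

$$\mathbf{T}_\sigma\mathbf{A} = \varepsilon_\sigma\mathbf{A} \tag{37.11}$$

for all permutations $\sigma$. Substituting (37.11) into (37.1), we get

$$\mathbf{K}_r\mathbf{A} = \frac{1}{r!}\sum_\sigma\varepsilon_\sigma^2\mathbf{A} = \frac{1}{r!}(r!)\mathbf{A} = \mathbf{A} \tag{37.12}$$

Here we have used the familiar fact that there are a total of $r!$ permutations for $r$ numbers
$\{1,\ldots,r\}$.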
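Having proved (37.10), we can conclude immediately that

$$R(\mathbf{K}_r)\supset\hat{\mathscr{T}}_r(\mathscr{V}) \tag{37.13}$$

since $\hat{\mathscr{T}}_r(\mathscr{V})$ is a subspace of $\mathscr{T}_r(\mathscr{V})$. Hence, to complete the proof it suffices to show that

$$R(\mathbf{K}_r)\subset\hat{\mathscr{T}}_r(\mathscr{V})$$

This condition means that

$$\mathbf{T}_\tau(\mathbf{K}_r\mathbf{A}) = \varepsilon_\tau\mathbf{K}_r\mathbf{A} \tag{37.14}$$

for all $\mathbf{A}\in\mathscr{T}_r(\mathscr{V})$. From (37.1), $\mathbf{T}_\tau(\mathbf{K}_r\mathbf{A})$ is given by

$$\mathbf{T}_\tau(\mathbf{K}_r\mathbf{A}) = \frac{1}{r!}\sum_\sigma\varepsilon_\sigma\mathbf{T}_\tau(\mathbf{T}_\sigma\mathbf{A})$$

From the result of Exercise 36.2, we can rewrite the preceding equation as

$$\mathbf{T}_\tau(\mathbf{K}_r\mathbf{A}) = \frac{1}{r!}\sum_\sigma\varepsilon_\sigma\mathbf{T}_{\tau\sigma}\mathbf{A} \tag{37.15}$$

Since the set of all permutations forms a group, and since

$$\varepsilon_\tau\varepsilon_\sigma = \varepsilon_{\tau\sigma}$$

the right-hand side of (37.15) is equal to

$$\frac{1}{r!}\sum_\sigma\varepsilon_\tau\varepsilon_{\tau\sigma}\mathbf{T}_{\tau\sigma}\mathbf{A}$$

or, equivalently,

$$\varepsilon_\tau\frac{1}{r!}\sum_{\tau\sigma}\varepsilon_{\tau\sigma}\mathbf{T}_{\tau\sigma}\mathbf{A}$$

which is simply another way of writing $\varepsilon_\tau\mathbf{K}_r\mathbf{A}$, so (37.14) is proved.

By exactly the same argument leading to (37.14) we can prove also that

$$\mathbf{K}_r(\mathbf{T}_\tau\mathbf{A}) = \varepsilon_\tau\mathbf{K}_r\mathbf{A} \tag{37.16}$$

Hence $\mathbf{K}_r$ and $\mathbf{T}_\tau$ commute for all permutations $\tau$. Also, (37.9) and (37.10) now imply that
$\mathbf{K}_r^2 = \mathbf{K}_r$ for all $r$. Hence we have shown that $\mathbf{K}_r$ is a projection from $\mathscr{T}_r(\mathscr{V})$ to $\hat{\mathscr{T}}_r(\mathscr{V})$. From a
result for projections in general [cf. equation (17.13)], $\mathscr{T}_r(\mathscr{V})$ can be decomposed into the direct
sum

$$\mathscr{T}_r(\mathscr{V}) = \hat{\mathscr{T}}_r(\mathscr{V})\oplus K(\mathbf{K}_r) \tag{37.17}$$

where $K(\mathbf{K}_r)$ is the kernel of $\mathbf{K}_r$ and is characterized by the following theorem.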


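Theorem 37.2. The kernel $K(\mathbf{K}_r)$ is generated by the set of simple tensors $\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r$ having at
least one pair of equal vectors among the vectors $\mathbf{v}^1,\ldots,\mathbf{v}^r$.

Proof. Since $\mathscr{T}_r(\mathscr{V})$ is generated by simple tensors, it suffices to show that the difference

$$\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r - \mathbf{K}_r(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r) \tag{37.18}$$

can be expressed as a linear combination of simple tensors having the prescribed property. From
(36.3) and (37.1), the difference (37.18) can be written as

$$\frac{1}{r!}\sum_\sigma\left(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r - \varepsilon_\sigma\,\mathbf{v}^{\sigma(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma(r)}\right) \tag{37.19}$$

We claim that each sum of the form

$$\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r - \varepsilon_\sigma\,\mathbf{v}^{\sigma(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma(r)} \tag{37.20}$$

can be expressed as a sum of simple tensors each having at least two equal vectors among the $r$
vectors forming the tensor product. This fact is more or less obvious since in Section 20 we have
mentioned that every permutation $\sigma$ can be decomposed into a product of permutations each
switching only one pair of indices, say

$$\sigma = \sigma_k\sigma_{k-1}\cdots\sigma_2\sigma_1 \tag{37.21}$$

where the number $k$ is even or odd corresponding to $\sigma$ being an even or odd permutation,
respectively. Using the decomposition (37.21), we can rewrite (37.20) in the form

$$\begin{aligned}
&+\left(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r + \mathbf{v}^{\sigma_1(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma_1(r)}\right)\\
&-\left(\mathbf{v}^{\sigma_1(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma_1(r)} + \mathbf{v}^{\sigma_2\sigma_1(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma_2\sigma_1(r)}\right)\\
&+\cdots\\
&-\varepsilon_\sigma\left(\mathbf{v}^{\sigma_{k-1}\cdots\sigma_1(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma_{k-1}\cdots\sigma_1(r)} + \mathbf{v}^{\sigma(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma(r)}\right)
\end{aligned} \tag{37.22}$$

where all the intermediate terms $\mathbf{v}^{\sigma_j\cdots\sigma_1(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma_j\cdots\sigma_1(r)}$, $j = 1,\ldots,k-1$, cancel in the sum. Since
each of $\sigma_1,\ldots,\sigma_k$ switches only one pair of indices, a typical term in the sum (37.22) has the form

$$\mathbf{v}^a\otimes\cdots\mathbf{v}^s\cdots\mathbf{v}^t\cdots\otimes\mathbf{v}^b + \mathbf{v}^a\otimes\cdots\mathbf{v}^t\cdots\mathbf{v}^s\cdots\otimes\mathbf{v}^b$$

which can be combined into three simple tensors:

$$\mathbf{v}^a\otimes\cdots(\mathbf{u})\cdots(\mathbf{u})\cdots\otimes\mathbf{v}^b$$

where $\mathbf{u} = \mathbf{v}^s$, $\mathbf{v}^t$, and $\mathbf{v}^s + \mathbf{v}^t$. Consequently, (37.22) is a sum of simple tensors having the
property prescribed by the theorem. Of course, from (37.16) all those simple tensors belong to the
kernel $K(\mathbf{K}_r)$. The proof is complete.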
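The properties of $\mathbf{K}_r$ established above can be checked on components. The sketch below is illustrative only: it assumes components are taken relative to a fixed basis and stored in a dictionary keyed by 0-based index tuples, and the helper names (`parity`, `transpose`, `skew_part`, `simple_tensor`) are hypothetical. Exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def parity(sigma):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    n = len(sigma)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def transpose(A, sigma):
    """(T_sigma A)[i_1..i_r] = A[i_sigma(1)..i_sigma(r)], cf. (36.2)."""
    return {idx: A[tuple(idx[s] for s in sigma)] for idx in A}

def skew_part(A):
    """K_r A = (1/r!) * sum over sigma of eps_sigma * T_sigma A, cf. (37.1)."""
    r = len(next(iter(A)))
    out = {idx: Fraction(0) for idx in A}
    for sigma in permutations(range(r)):
        sign, TA = parity(sigma), transpose(A, sigma)
        for idx in A:
            out[idx] += sign * TA[idx]
    return {idx: value / factorial(r) for idx, value in out.items()}

def simple_tensor(*vectors):
    """Components of v^1 (x) ... (x) v^r from tuples of vector components."""
    N = len(vectors[0])
    out = {}
    for idx in product(range(N), repeat=len(vectors)):
        value = 1
        for v, i in zip(vectors, idx):
            value *= v[i]
        out[idx] = value
    return out

A = simple_tensor((1, 2, 3), (1, 4, 9), (1, 8, 27))   # arbitrary order-3 example
KA = skew_part(A)

is_projection = skew_part(KA) == KA                    # K_r^2 = K_r
is_skew = all(                                         # (37.14) for every tau
    transpose(KA, tau) == {idx: parity(tau) * v for idx, v in KA.items()}
    for tau in permutations(range(3))
)
# A simple tensor with a repeated factor lies in the kernel (Theorem 37.2).
repeated = simple_tensor((1, 2, 3), (0, 1, 5), (1, 2, 3))
in_kernel = all(v == 0 for v in skew_part(repeated).values())
print(is_projection, is_skew, in_kernel)
```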


In view of the decomposition (37.17), the subspace $\hat{\mathscr{T}}_r(\mathscr{V})$ is isomorphic to the factor space
$\mathscr{T}_r(\mathscr{V})/K(\mathbf{K}_r)$. In fact, some authors use this structure to define the space $\hat{\mathscr{T}}_r(\mathscr{V})$ abstractly
without making $\hat{\mathscr{T}}_r(\mathscr{V})$ a subspace of $\mathscr{T}_r(\mathscr{V})$. The preceding theorem shows that this abstract
definition of $\hat{\mathscr{T}}_r(\mathscr{V})$ is equivalent to ours.

The next theorem gives a useful property of the skew-symmetric operator $\mathbf{K}_r$.


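Theorem 37.3. If $\mathbf{A}\in\mathscr{T}_p(\mathscr{V})$ and $\mathbf{B}\in\mathscr{T}_q(\mathscr{V})$, then

$$\mathbf{K}_{p+q}(\mathbf{A}\otimes\mathbf{B}) = \mathbf{K}_{p+q}(\mathbf{A}\otimes\mathbf{K}_q\mathbf{B}) = \mathbf{K}_{p+q}(\mathbf{K}_p\mathbf{A}\otimes\mathbf{B}) \tag{37.23}$$

Proof. Let $\tau$ be an arbitrary permutation of $\{1,\ldots,q\}$. We define

$$\sigma = \begin{pmatrix}1 & 2 & \cdots & p & p+1 & \cdots & p+q\\ 1 & 2 & \cdots & p & p+\tau(1) & \cdots & p+\tau(q)\end{pmatrix}$$

Then $\varepsilon_\sigma = \varepsilon_\tau$ and

$$\mathbf{A}\otimes\mathbf{T}_\tau\mathbf{B} = \mathbf{T}_\sigma(\mathbf{A}\otimes\mathbf{B})$$

Hence from (37.14) we have

$$\mathbf{K}_{p+q}(\mathbf{A}\otimes\mathbf{T}_\tau\mathbf{B}) = \mathbf{K}_{p+q}(\mathbf{T}_\sigma(\mathbf{A}\otimes\mathbf{B})) = \varepsilon_\sigma\mathbf{K}_{p+q}(\mathbf{A}\otimes\mathbf{B}) = \varepsilon_\tau\mathbf{K}_{p+q}(\mathbf{A}\otimes\mathbf{B})$$

or, equivalently,

$$\mathbf{K}_{p+q}(\mathbf{A}\otimes\varepsilon_\tau\mathbf{T}_\tau\mathbf{B}) = \mathbf{K}_{p+q}(\mathbf{A}\otimes\mathbf{B}) \tag{37.24}$$

Summing (37.24) over all $\tau$ and dividing by $q!$, we obtain (37.23)$_1$. A similar argument implies (37.23)$_2$.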
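As a numerical illustration of Theorem 37.3, the following sketch compares both sides of the identity for one tensor of order 1 and one of order 2 on a three-dimensional space. It is an assumption-laden check, not a proof: components are stored in dictionaries keyed by 0-based index tuples, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def parity(sigma):
    """+1 for an even permutation, -1 for an odd one."""
    n = len(sigma)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def transpose(A, sigma):
    """(T_sigma A)[i_1..i_r] = A[i_sigma(1)..i_sigma(r)]."""
    return {idx: A[tuple(idx[s] for s in sigma)] for idx in A}

def skew_part(A):
    """K_r A = (1/r!) * sum over sigma of eps_sigma * T_sigma A."""
    r = len(next(iter(A)))
    out = {idx: Fraction(0) for idx in A}
    for sigma in permutations(range(r)):
        sign, TA = parity(sigma), transpose(A, sigma)
        for idx in A:
            out[idx] += sign * TA[idx]
    return {idx: value / factorial(r) for idx, value in out.items()}

def tensor_product(A, B):
    """Components of A (x) B: index tuples are concatenated."""
    return {i + j: a * b for i, a in A.items() for j, b in B.items()}

N = 3
P = {(i,): 2 ** i for i in range(N)}                                           # order 1
Q = {(j, k): (j + 1) * (k + 1) ** 2 for j, k in product(range(N), repeat=2)}   # order 2, not symmetric

lhs = skew_part(tensor_product(P, Q))               # K_3(P (x) Q)
rhs = skew_part(tensor_product(P, skew_part(Q)))    # K_3(P (x) K_2 Q)
lhs2 = skew_part(tensor_product(Q, P))              # K_3(Q (x) P)
rhs2 = skew_part(tensor_product(skew_part(Q), P))   # K_3(K_2 Q (x) P)
print(lhs == rhs, lhs2 == rhs2)
```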


In closing this section, we state without proof an expression for the skew-symmetric
operator in terms of the components of its tensor argument. The formula is

$$\mathbf{K}_r\mathbf{A} = \frac{1}{r!}\,\delta^{j_1\ldots j_r}_{i_1\ldots i_r}A_{j_1\ldots j_r}\,\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_r} \tag{37.25}$$

We leave the proof of this formula as an exercise to the reader. A classical notation for the
components of $\mathbf{K}_r\mathbf{A}$ is $A_{[i_1\ldots i_r]}$, so from (37.25)

$$A_{[i_1\ldots i_r]} = \frac{1}{r!}\,\delta^{j_1\ldots j_r}_{i_1\ldots i_r}A_{j_1\ldots j_r} \tag{37.26}$$

Naturally, we call $\mathbf{K}_r\mathbf{A}$ the skew-symmetric part of $\mathbf{A}$. The formula (37.25) and the quotient
theorem mentioned in Exercise 33.7 imply that the component formula (33.32) defines a tensor $\mathbf{K}_r$
of order $2r$, a fact proved in Section 33 by means of the transformation law.


Exercises 


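37.1 Let $\mathbf{A}\in\mathscr{T}_r(\mathscr{V})$ and define an endomorphism $\mathbf{S}_r$ of $\mathscr{T}_r(\mathscr{V})$ by

$$\mathbf{S}_r\mathbf{A} = \frac{1}{r!}\sum_\sigma\mathbf{T}_\sigma\mathbf{A}$$

where the summation is taken over all permutations $\sigma$ of $\{1,\ldots,r\}$ as in (37.1). Naturally
$\mathbf{S}_r$ is called the symmetric operator. Show that it is a projection from $\mathscr{T}_r(\mathscr{V})$ into the
subspace consisting of completely symmetric tensors of order $r$. What is the kernel
$K(\mathbf{S}_r)$? A classical notation for the components of $\mathbf{S}_r\mathbf{A}$ is $A_{(i_1\ldots i_r)}$.

37.2 Prove the formula (37.25).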
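Section 38. The Wedge Product


In Section 33 we defined the concept of the tensor product $\otimes$, first for vectors, then
generalized to tensors. We pointed out that the tensor product has the important universal
factorization property. In this section we shall define a similar operation, called the wedge product
(or the exterior product), which we shall denote by the symbol $\wedge$. We shall define first the wedge
product of any set of vectors.

If $(\mathbf{v}^1,\ldots,\mathbf{v}^r)$ is any r-tuple of vectors, then their tensor product $\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r$ is a simple
tensor in $\mathscr{T}_r(\mathscr{V})$ [cf. equation (33.10)]. In the preceding section, we have introduced the
skew-symmetric operator $\mathbf{K}_r$, which is a projection from $\mathscr{T}_r(\mathscr{V})$ onto $\hat{\mathscr{T}}_r(\mathscr{V})$. We now define

$$\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r \equiv \wedge(\mathbf{v}^1,\ldots,\mathbf{v}^r) \equiv r!\,\mathbf{K}_r(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r) \tag{38.1}$$

for any vectors $\mathbf{v}^1,\ldots,\mathbf{v}^r$. For example, if $r = 2$, from (37.5) for the special case that $\mathbf{A} = \mathbf{v}^1\otimes\mathbf{v}^2$
we have

$$\mathbf{v}^1\wedge\mathbf{v}^2 = \mathbf{v}^1\otimes\mathbf{v}^2 - \mathbf{v}^2\otimes\mathbf{v}^1 \tag{38.2}$$

In general, from (37.1) and (36.3) we have

$$\begin{aligned}
\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r &= \sum_\sigma\varepsilon_\sigma\mathbf{T}_\sigma(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r)\\
&= \sum_\sigma\varepsilon_\sigma\left(\mathbf{v}^{\sigma^{-1}(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma^{-1}(r)}\right)\\
&= \sum_\sigma\varepsilon_\sigma\left(\mathbf{v}^{\sigma(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma(r)}\right)
\end{aligned} \tag{38.3}$$

where in deriving (38.3), we have used the fact that

$$\varepsilon_\sigma = \varepsilon_{\sigma^{-1}} \tag{38.4}$$

and the fact that the summation is taken over all permutations $\sigma$ of $\{1,\ldots,r\}$.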
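The signed sum (38.3) translates directly into a short computation. The sketch below is illustrative: vectors are given by tuples of components relative to a fixed basis, indices are 0-based, and the function names `parity` and `wedge` are hypothetical. For $r = 2$ it reproduces (38.2).

```python
from itertools import permutations, product

def parity(sigma):
    """+1 for an even permutation, -1 for an odd one."""
    n = len(sigma)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def wedge(*vectors):
    """Components of v^1 ^ ... ^ v^r as the sum over sigma of
    eps_sigma * v^sigma(1) (x) ... (x) v^sigma(r), following (38.3)."""
    r, N = len(vectors), len(vectors[0])
    out = {idx: 0 for idx in product(range(N), repeat=r)}
    for sigma in permutations(range(r)):
        sign = parity(sigma)
        for idx in out:
            term = sign
            for slot, i in enumerate(idx):
                term *= vectors[sigma[slot]][i]
            out[idx] += term
    return out

v1, v2 = (1, 2, 0), (0, 1, 3)
w = wedge(v1, v2)
# Component (i, j) equals v1[i]*v2[j] - v2[i]*v1[j], as in (38.2).
print(w[(0, 1)], w[(1, 0)], w[(1, 2)])
```

Swapping the two arguments changes the sign of every component, and repeating an argument gives the zero tensor.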
We can regard (38.1) as the definition of the operation

$$\wedge:\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{r\text{ times}}\to\hat{\mathscr{T}}_r(\mathscr{V}) \tag{38.5}$$

which is called the operation of wedge product. From (38.1) it is clear that this operation, like the
tensor product, is multilinear; further, $\wedge$ is a (completely) skew-symmetric operation in the sense
that

$$\wedge(\mathbf{v}^{\sigma(1)},\ldots,\mathbf{v}^{\sigma(r)}) = \varepsilon_\sigma\wedge(\mathbf{v}^1,\ldots,\mathbf{v}^r) \tag{38.6}$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^r\in\mathscr{V}$ and all permutations $\sigma$ of $\{1,\ldots,r\}$. This fact follows directly from the skew
symmetry of $\mathbf{K}_r$ [cf. (37.16)]. Next we show that the wedge product has also a universal
factorization property which is the condition asserted by the following.


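Theorem 38.1. If $\mathbf{W}$ is an arbitrary completely skew-symmetric multilinear transformation

$$\mathbf{W}:\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{r\text{ times}}\to\mathscr{U} \tag{38.7}$$

where $\mathscr{U}$ is an arbitrary vector space, then there exists a unique linear transformation

$$\mathbf{D}:\hat{\mathscr{T}}_r(\mathscr{V})\to\mathscr{U} \tag{38.8}$$

such that

$$\mathbf{W}(\mathbf{v}^1,\ldots,\mathbf{v}^r) = \mathbf{D}(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r) \tag{38.9}$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^r\in\mathscr{V}$. In operator form, (38.9) means that

$$\mathbf{W} = \mathbf{D}\circ\wedge \tag{38.10}$$

Proof. We can use the universal factorization property of the tensor product (cf. Theorem 34.1) to
decompose $\mathbf{W}$ by

$$\mathbf{W} = \mathbf{C}\circ\otimes \tag{38.11}$$

where $\mathbf{C}$ is a linear transformation from $\mathscr{T}_r(\mathscr{V})$ to $\mathscr{U}$,

$$\mathbf{C}:\mathscr{T}_r(\mathscr{V})\to\mathscr{U} \tag{38.12}$$

Now in view of the fact that $\mathbf{W}$ is skew-symmetric, we see that the particular linear transformation
$\mathbf{C}$ has the property

$$\mathbf{C}(\mathbf{v}^{\sigma(1)}\otimes\cdots\otimes\mathbf{v}^{\sigma(r)}) = \varepsilon_\sigma\mathbf{C}(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r) \tag{38.13}$$

for all simple tensors $\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r\in\mathscr{T}_r(\mathscr{V})$ and all permutations $\sigma$ of $\{1,\ldots,r\}$. Consequently, if
we multiply (38.13) by $\varepsilon_\sigma$ and sum the result over all $\sigma$, then from the linearity of $\mathbf{C}$ and the
definition (38.3) we have

$$\mathbf{C}(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r) = r!\,\mathbf{C}(\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r) = r!\,\mathbf{W}(\mathbf{v}^1,\ldots,\mathbf{v}^r) \tag{38.14}$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^r\in\mathscr{V}$. Thus the desired linear transformation $\mathbf{D}$ is simply given by

$$\mathbf{D} = \frac{1}{r!}\,\mathbf{C}\big|_{\hat{\mathscr{T}}_r(\mathscr{V})} \tag{38.15}$$

where the symbol on the right-hand side denotes the restriction of $\mathbf{C}$ on $\hat{\mathscr{T}}_r(\mathscr{V})$ as usual.

Uniqueness of $\mathbf{D}$ can be proved in exactly the same way as in the proof of Theorem 34.1.
Here we need the fact that the tensors of the form $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r\in\hat{\mathscr{T}}_r(\mathscr{V})$, which may be called
simple skew-symmetric tensors for an obvious reason, generate the space $\hat{\mathscr{T}}_r(\mathscr{V})$. This fact is a
direct consequence of the following results:

(i) $\mathscr{T}_r(\mathscr{V})$ is generated by the set of all simple tensors $\mathbf{v}^1\otimes\cdots\otimes\mathbf{v}^r$.

(ii) $\mathbf{K}_r$ is a linear transformation from $\mathscr{T}_r(\mathscr{V})$ onto $\hat{\mathscr{T}}_r(\mathscr{V})$.

(iii) Equation (38.1) which defines the simple skew-symmetric tensors $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r$.

Knowing that the simple skew-symmetric tensors $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r$ form a generating set of
$\hat{\mathscr{T}}_r(\mathscr{V})$, we can conclude immediately that the linear transformation $\mathbf{D}$ is unique, since its values
on the generating set are uniquely determined by $\mathbf{W}$ through the basic condition (38.9).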


As remarked before, the universal factorization property can be used to define the tensor
product abstractly. The preceding theorem shows that the same applies to the wedge product. In
fact, by following this abstract approach, one can define the vector space $\hat{\mathscr{T}}_r(\mathscr{V})$ entirely
independent of the vector space $\mathscr{T}_r(\mathscr{V})$ and the operation $\wedge$ entirely independent of the operation
$\otimes$.


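Having defined the wedge product for vectors, we can generalize the operation easily to
skew-symmetric tensors. If $\mathbf{A}\in\hat{\mathscr{T}}_p(\mathscr{V})$ and $\mathbf{B}\in\hat{\mathscr{T}}_q(\mathscr{V})$, then we define

$$\mathbf{A}\wedge\mathbf{B} = \binom{r}{p}\mathbf{K}_r(\mathbf{A}\otimes\mathbf{B}) \tag{38.16}$$

where, as before,

$$r = p + q \tag{38.17}$$

and

$$\binom{r}{p} = \frac{r(r-1)\cdots(r-p+1)}{p!} = \frac{r!}{p!\,q!} \tag{38.18}$$

We can regard (38.16) as the definition of the wedge product from $\hat{\mathscr{T}}_p(\mathscr{V})\times\hat{\mathscr{T}}_q(\mathscr{V})\to\hat{\mathscr{T}}_r(\mathscr{V})$,

$$\wedge:\hat{\mathscr{T}}_p(\mathscr{V})\times\hat{\mathscr{T}}_q(\mathscr{V})\to\hat{\mathscr{T}}_r(\mathscr{V}) \tag{38.19}$$

Clearly, this operation is bilinear and is characterized by the condition that

$$(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^p)\wedge(\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^q) = \mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^p\wedge\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^q \tag{38.20}$$

for all simple skew-symmetric tensors $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^p\in\hat{\mathscr{T}}_p(\mathscr{V})$ and $\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^q\in\hat{\mathscr{T}}_q(\mathscr{V})$. We leave
the proof of this simple fact as an exercise to the reader. In component form, the wedge product
$\mathbf{A}\wedge\mathbf{B}$ is given by

$$\mathbf{A}\wedge\mathbf{B} = \frac{1}{p!\,q!}\,\delta^{j_1\ldots j_r}_{i_1\ldots i_r}A_{j_1\ldots j_p}B_{j_{p+1}\ldots j_r}\,\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_r} \tag{38.21}$$

relative to the product basis of any basis $\{\mathbf{e}^i\}$ for $\mathscr{V}$.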
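The definition (38.16) and the characterization (38.20) can be tried out on components. The following sketch is an illustration under stated assumptions: tensors are dictionaries keyed by 0-based index tuples relative to a fixed basis, vectors are treated as tensors of order 1, and all helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def parity(sigma):
    """+1 for an even permutation, -1 for an odd one."""
    n = len(sigma)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def skew_part(A):
    """K_r A = (1/r!) * sum over sigma of eps_sigma * T_sigma A, cf. (37.1)."""
    r = len(next(iter(A)))
    out = {idx: Fraction(0) for idx in A}
    for sigma in permutations(range(r)):
        sign = parity(sigma)
        for idx in A:
            out[idx] += sign * A[tuple(idx[s] for s in sigma)]
    return {idx: value / factorial(r) for idx, value in out.items()}

def tensor_product(A, B):
    """Components of A (x) B: index tuples are concatenated."""
    return {i + j: a * b for i, a in A.items() for j, b in B.items()}

def wedge(A, B):
    """A ^ B = binom(p+q, p) * K_{p+q}(A (x) B), following (38.16)."""
    p, q = len(next(iter(A))), len(next(iter(B)))
    K = skew_part(tensor_product(A, B))
    return {idx: comb(p + q, p) * value for idx, value in K.items()}

def vector(*components):
    """An order-1 tensor from its components."""
    return {(i,): c for i, c in enumerate(components)}

u, v, w = vector(1, 2, 3), vector(1, 4, 9), vector(1, 8, 27)
left = wedge(wedge(u, v), w)     # (u ^ v) ^ w
right = wedge(u, wedge(v, w))    # u ^ (v ^ w)
print(left == right, left[(0, 1, 2)])
```

For three vectors in three dimensions the component with indices (0, 1, 2) is the determinant of the matrix whose rows are the three component lists.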


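In view of (38.20), we can generalize the wedge product further to an arbitrary number of
skew-symmetric tensors. For example, if $\mathbf{A}\in\hat{\mathscr{T}}_a(\mathscr{V})$, $\mathbf{B}\in\hat{\mathscr{T}}_b(\mathscr{V})$ and $\mathbf{C}\in\hat{\mathscr{T}}_c(\mathscr{V})$, then we have

$$(\mathbf{A}\wedge\mathbf{B})\wedge\mathbf{C} = \mathbf{A}\wedge(\mathbf{B}\wedge\mathbf{C}) = \mathbf{A}\wedge\mathbf{B}\wedge\mathbf{C} \tag{38.22}$$

Moreover, $\mathbf{A}\wedge\mathbf{B}\wedge\mathbf{C}$ is also given by

$$\mathbf{A}\wedge\mathbf{B}\wedge\mathbf{C} = \frac{(a+b+c)!}{a!\,b!\,c!}\,\mathbf{K}_{(a+b+c)}(\mathbf{A}\otimes\mathbf{B}\otimes\mathbf{C}) \tag{38.23}$$

Further, the wedge product is multilinear in its arguments and is characterized by the associative
law such as

$$(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^a)\wedge(\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^b)\wedge(\mathbf{w}^1\wedge\cdots\wedge\mathbf{w}^c) = \mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^a\wedge\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^b\wedge\mathbf{w}^1\wedge\cdots\wedge\mathbf{w}^c \tag{38.24}$$

for all simple tensors involved.

From (38.24), the skew symmetry of the wedge product for vectors [cf. (38.6)] can be
generalized to the condition such as

$$\mathbf{B}\wedge\mathbf{A} = (-1)^{pq}\,\mathbf{A}\wedge\mathbf{B} \tag{38.25}$$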
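for all $\mathbf{A}\in\hat{\mathscr{T}}_p(\mathscr{V})$ and $\mathbf{B}\in\hat{\mathscr{T}}_q(\mathscr{V})$. In particular, if $\mathbf{A}$ is of odd order, then

$$\mathbf{A}\wedge\mathbf{A} = \mathbf{0} \tag{38.26}$$

since from (38.25) if the order $p$ of $\mathbf{A}$ is odd, then

$$\mathbf{A}\wedge\mathbf{A} = (-1)^{p^2}\mathbf{A}\wedge\mathbf{A} = -\mathbf{A}\wedge\mathbf{A} \tag{38.27}$$

A special case of (38.26) is the elementary result that

$$\mathbf{v}\wedge\mathbf{v} = \mathbf{0} \tag{38.28}$$

which is also obvious from (38.2).


Exercises


38.1 Verify (38.6), (38.20), (38.22), and (38.23).
38.2 Show that (38.3) can be rewritten as

$$\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r = \delta^{1\ldots r}_{i_1\ldots i_r}\,\mathbf{v}^{i_1}\otimes\cdots\otimes\mathbf{v}^{i_r} \tag{38.29}$$

where the repeated indices are summed from 1 to $r$.
38.3 Show that

$$\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r(\mathbf{u}_1,\ldots,\mathbf{u}_r) = \det\begin{bmatrix}\mathbf{v}^1\cdot\mathbf{u}_1 & \cdots & \mathbf{v}^1\cdot\mathbf{u}_r\\ \vdots & & \vdots\\ \mathbf{v}^r\cdot\mathbf{u}_1 & \cdots & \mathbf{v}^r\cdot\mathbf{u}_r\end{bmatrix} \tag{38.30}$$

for all vectors involved.
38.4 Show that

$$\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}(\mathbf{e}_{j_1},\ldots,\mathbf{e}_{j_r}) = \delta^{i_1\ldots i_r}_{j_1\ldots j_r} \tag{38.31}$$

for any reciprocal bases $\{\mathbf{e}^i\}$ and $\{\mathbf{e}_i\}$.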
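Section 39. Product Bases and Strict Components


In the preceding section we have remarked that the space $\hat{\mathscr{T}}_r(\mathscr{V})$ is generated by the set of
all simple skew-symmetric tensors of the form $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r$. This generating set is linearly
dependent, of course. In this section we shall determine a linearly independent generating set and
thus a basis for $\hat{\mathscr{T}}_r(\mathscr{V})$ consisting entirely of simple skew-symmetric tensors. Naturally, we call
such a basis a product basis for $\hat{\mathscr{T}}_r(\mathscr{V})$.

Let $\{\mathbf{e}^i\}$ be a basis for $\mathscr{V}$ as usual. Then the simple tensors $\{\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_r}\}$ form a product
basis for $\mathscr{T}_r(\mathscr{V})$. Since $\hat{\mathscr{T}}_r(\mathscr{V})$ is a subspace of $\mathscr{T}_r(\mathscr{V})$, every element $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$ has the
representation

$$\mathbf{A} = A_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_r} \tag{39.1}$$

where the repeated indices are summed from 1 to $N$, the dimension of $\mathscr{V}$. Now, since
$\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$, it is invariant under the skew-symmetric operator $\mathbf{K}_r$, namely

$$\mathbf{A} = \mathbf{K}_r\mathbf{A} = A_{i_1\ldots i_r}\,\mathbf{K}_r(\mathbf{e}^{i_1}\otimes\cdots\otimes\mathbf{e}^{i_r}) \tag{39.2}$$

Then from (38.1) we can rewrite the representation (39.1) as

$$\mathbf{A} = \frac{1}{r!}A_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} \tag{39.3}$$

Thus we have shown that the set of simple skew-symmetric tensors $\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}\}$ already forms a
generating set for $\hat{\mathscr{T}}_r(\mathscr{V})$.

The generating set $\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}\}$ is still not linearly independent, however. This fact is
easily seen, since the wedge product is skew-symmetric, as shown by (38.6). Indeed, if $i_1$ and $i_2$
are equal, then $\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}$ must vanish. In general if $\sigma$ is any permutation of $\{1,\ldots,r\}$, then
$\mathbf{e}^{i_{\sigma(1)}}\wedge\cdots\wedge\mathbf{e}^{i_{\sigma(r)}}$ is linearly related to $\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}$ by

$$\mathbf{e}^{i_{\sigma(1)}}\wedge\cdots\wedge\mathbf{e}^{i_{\sigma(r)}} = \varepsilon_\sigma\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} \tag{39.4}$$

Hence if we eliminate the redundant elements of the generating set $\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}\}$ by restricting the
range of the indices $(i_1,\ldots,i_r)$ in such a way that

$$i_1 < \cdots < i_r \tag{39.5}$$

the resulting subset $\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r},\ i_1 < \cdots < i_r\}$ remains a generating set of $\hat{\mathscr{T}}_r(\mathscr{V})$. The next theorem
shows that this subset is linearly independent and thus a basis for $\hat{\mathscr{T}}_r(\mathscr{V})$, called the product basis.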


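Theorem 39.1. The set $\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r},\ i_1 < \cdots < i_r\}$ is linearly independent.

Proof. Suppose that the set $\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r},\ i_1 < \cdots < i_r\}$ obeys the homogeneous linear equation

$$\sum_{i_1<\cdots<i_r}C_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} = \mathbf{0} \tag{39.6}$$

where $\{C_{i_1\ldots i_r},\ i_1 < \cdots < i_r\}$ are scalars, and where the summation is taken over all indices $i_1,\ldots,i_r$
from 1 to $N$ subject to the condition (39.5). Then we must show that $C_{i_1\ldots i_r}$ vanishes completely.
From (38.31), if we evaluate the tensor (39.6) at the argument $(\mathbf{e}_{j_1},\ldots,\mathbf{e}_{j_r})$, where $\{\mathbf{e}_j\}$ denotes the
reciprocal basis of $\{\mathbf{e}^i\}$ as usual, we get

$$\sum_{i_1<\cdots<i_r}C_{i_1\ldots i_r}\,\delta^{i_1\ldots i_r}_{j_1\ldots j_r} = 0 \tag{39.7}$$

for all $j_1,\ldots,j_r$ ranging from 1 to $N$. In particular, if we choose $j_1 < \cdots < j_r$, then the summation
reduces to only one term, namely $C_{j_1\ldots j_r}$, and the equation yields

$$C_{j_1\ldots j_r} = 0$$

which is the desired result.

It is easy to see that there are exactly $\binom{N}{r}$ elements in the product basis

$$\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r},\ i_1 < \cdots < i_r\} \tag{39.8}$$

Thus we have

$$\dim\hat{\mathscr{T}}_r(\mathscr{V}) = \binom{N}{r} = \frac{N!}{r!\,(N-r)!} \tag{39.9}$$

In particular, we recover the corollary of Theorem 36.1, that is

$$\dim\hat{\mathscr{T}}_r(\mathscr{V}) = 0$$

if $r > N$.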
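The counting argument behind (39.9) can be made concrete by listing the admissible index sets. This short sketch (the function name `product_basis_labels` is hypothetical) enumerates the increasing index tuples that label the product basis (39.8) and compares their number with the binomial coefficient.

```python
from itertools import combinations
from math import comb

def product_basis_labels(N, r):
    """Increasing index tuples (i_1 < ... < i_r), 1-based, one per basis element
    e^{i_1} ^ ... ^ e^{i_r} of the product basis (39.8)."""
    return list(combinations(range(1, N + 1), r))

for N, r in [(3, 2), (4, 2), (4, 4), (3, 4)]:
    labels = product_basis_labels(N, r)
    print(N, r, len(labels), comb(N, r), labels)
```

For $r > N$ the list is empty, in agreement with the corollary of Theorem 36.1.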


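Returning now to the representation (39.3) for an arbitrary skew-symmetric tensor
$\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$, we can rewrite that representation as

$$\mathbf{A} = \sum_{i_1<\cdots<i_r}A_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} \tag{39.10}$$

We can replace the full summation in (39.3) by the restricted summation because
for each increasing r-tuple $(i_1,\ldots,i_r)$ there are precisely $r!$ permutations of $\{i_1,\ldots,i_r\}$. Further, their
corresponding $r!$ terms in the full summation (39.3) are all equal to one another, since both $A_{i_1\ldots i_r}$
and $\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}$ are completely skew-symmetric in the indices $(i_1,\ldots,i_r)$, so for any permutation $\sigma$
of $\{i_1,\ldots,i_r\}$ we have

$$A_{i_{\sigma(1)}\ldots i_{\sigma(r)}}\,\mathbf{e}^{i_{\sigma(1)}}\wedge\cdots\wedge\mathbf{e}^{i_{\sigma(r)}} = \varepsilon_\sigma^2A_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} = A_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}$$

In view of (39.10), we see that the scalars

$$\{A_{i_1\ldots i_r},\ i_1 < \cdots < i_r\}$$

are the components of $\mathbf{A}$ relative to the product basis (39.8). For definiteness, we call these
scalars the strict components of $\mathbf{A}$.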


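As an illustration of this concept, let us compute the strict components of the simple
skew-symmetric tensor $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r\in\hat{\mathscr{T}}_r(\mathscr{V})$. As usual we represent the vector $\mathbf{v}^i$ in component form
relative to $\{\mathbf{e}^j\}$ by

$$\mathbf{v}^i = v^i_j\,\mathbf{e}^j$$

for all $i = 1,\ldots,r$. Using the skew-symmetry of the wedge product, we have

$$\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r = v^1_{i_1}\cdots v^r_{i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} = \sum_{i_1<\cdots<i_r}\delta^{j_1\ldots j_r}_{i_1\ldots i_r}\,v^1_{j_1}\cdots v^r_{j_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} \tag{39.11}$$

which means that the strict components of $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r$ are

$$\left\{\delta^{j_1\ldots j_r}_{i_1\ldots i_r}\,v^1_{j_1}\cdots v^r_{j_r},\ i_1 < \cdots < i_r\right\}$$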
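For a fixed increasing index set, the contraction of the generalized Kronecker delta with the vector components in (39.11) is the determinant of the $r\times r$ array of the selected components. The sketch below uses that observation to list the strict components; it is an illustration with hypothetical function names, 0-based indices, and integer components.

```python
from itertools import combinations, permutations

def parity(sigma):
    """+1 for an even permutation, -1 for an odd one."""
    n = len(sigma)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant by the permutation (Leibniz) expansion; adequate for small matrices."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = parity(sigma)
        for row in range(n):
            term *= M[row][sigma[row]]
        total += term
    return total

def strict_components(vectors):
    """Strict components of v^1 ^ ... ^ v^r: one r x r minor of the component
    array for each increasing index tuple, cf. (39.11)."""
    r, N = len(vectors), len(vectors[0])
    return {idx: det([[v[i] for i in idx] for v in vectors])
            for idx in combinations(range(N), r)}

print(strict_components([(1, 2, 0), (0, 1, 3)]))
```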
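Next, we consider the transformation rule for the strict components in general.

Theorem 39.2. Under a change of basis from $\{\mathbf{e}^i\}$ to $\{\hat{\mathbf{e}}^i\}$, the strict components of a tensor
$\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$ obey the transformation rule

$$\hat{A}_{j_1\ldots j_r} = \sum_{i_1<\cdots<i_r}T^{i_1\ldots i_r}_{j_1\ldots j_r}A_{i_1\ldots i_r} \tag{39.12}$$

where $T^{i_1\ldots i_r}_{j_1\ldots j_r}$ is an $r\times r$ minor of the transformation matrix $[T^i_j]$ as defined by (21.21). Of course,
$T^i_j$ is given by

$$T^i_j = \mathbf{e}^i\cdot\hat{\mathbf{e}}_j$$

as usual [cf. (31.21)], where $\{\hat{\mathbf{e}}_j\}$ is the reciprocal basis of $\{\hat{\mathbf{e}}^j\}$.

Proof. Since the strict components are nothing but the ordinary tensor components restricted to the
subset of indices in increasing order [cf. (39.5)], we can compute their values in the usual way by

$$\hat{A}_{j_1\ldots j_r} = \mathbf{A}(\hat{\mathbf{e}}_{j_1},\ldots,\hat{\mathbf{e}}_{j_r}) \tag{39.13}$$

Substituting the strict component representation (39.10) into (39.13) and making use of the formula
(38.30) we obtain

$$\begin{aligned}
\hat{A}_{j_1\ldots j_r} &= \sum_{i_1<\cdots<i_r}A_{i_1\ldots i_r}\,\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}(\hat{\mathbf{e}}_{j_1},\ldots,\hat{\mathbf{e}}_{j_r})\\
&= \sum_{i_1<\cdots<i_r}A_{i_1\ldots i_r}\det\begin{bmatrix}\mathbf{e}^{i_1}\cdot\hat{\mathbf{e}}_{j_1} & \cdots & \mathbf{e}^{i_1}\cdot\hat{\mathbf{e}}_{j_r}\\ \vdots & & \vdots\\ \mathbf{e}^{i_r}\cdot\hat{\mathbf{e}}_{j_1} & \cdots & \mathbf{e}^{i_r}\cdot\hat{\mathbf{e}}_{j_r}\end{bmatrix}
\end{aligned}$$

which is the desired result.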


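From (21.25), we can write the transformation rule (39.12) in the form

$$\hat{A}_{j_1\ldots j_r} = \det[T^a_b]\sum_{i_1<\cdots<i_r}\left(\operatorname{cof}\hat{T}^{j_1\ldots j_r}_{i_1\ldots i_r}\right)A_{i_1\ldots i_r} \tag{39.14}$$

As an illustration of (39.12), take $r = N$. Then (39.9) implies

$$\dim\hat{\mathscr{T}}_N(\mathscr{V}) = 1 \tag{39.15}$$

and (39.12) reduces to

$$\hat{A}_{12\ldots N} = \det[T^i_j]\,A_{12\ldots N} \tag{39.16}$$

Comparing (39.16) with (33.15), we see that the strict component of an N-vector transforms
according to the rule of an axial scalar of weight 1 [cf. (33.35)]. N-vectors are also called densities
or density tensors.

Next take $r = N-1$. Then (39.9) implies that

$$\dim\hat{\mathscr{T}}_{N-1}(\mathscr{V}) = N \tag{39.17}$$

In this case the transformation rule (39.12) can be rewritten as

$$\hat{A}^k = \det[T^i_j]\,\hat{T}^k_l A^l \tag{39.18}$$

where the quantities $\hat{A}^k$ and $A^l$ are defined by

$$\hat{A}^k = (-1)^{N-k}\hat{A}_{12\ldots\hat{k}\ldots N} \tag{39.19}$$

and similarly

$$A^l = (-1)^{N-l}A_{12\ldots\hat{l}\ldots N} \tag{39.20}$$

Here the symbol $\hat{\ }$ over $k$ or $l$ means $k$ or $l$ are deleted from the list of indices as before. To
prove (39.18), we make use of the alternative form (39.14), obtaining

$$\hat{A}_{12\ldots\hat{k}\ldots N} = \det[T^i_j]\sum_{l}\left(\operatorname{cof}\hat{T}^{12\ldots\hat{k}\ldots N}_{12\ldots\hat{l}\ldots N}\right)A_{12\ldots\hat{l}\ldots N}$$

Multiplying this equation by $(-1)^{N-k}$ and using the definitions (39.19) and (39.20), we get

$$\hat{A}^k = \det[T^i_j]\sum_{l}(-1)^{k+l}\left(\operatorname{cof}\hat{T}^{12\ldots\hat{k}\ldots N}_{12\ldots\hat{l}\ldots N}\right)A^l$$

But now from (21.23), it is easy to see that the cofactor of the $(N-1)\times(N-1)$ minor $\hat{T}^{12\ldots\hat{k}\ldots N}_{12\ldots\hat{l}\ldots N}$ in the
matrix $[\hat{T}^i_j]$ is simply $(-1)^{k+l}$ times the element $\hat{T}^k_l$. Thus (39.18) is proved.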
From (39.19) or (39.20) we see that the quantities $\hat{A}^k$ and $A^l$, like the strict components
$\hat{A}_{12\ldots\hat{k}\ldots N}$ and $A_{12\ldots\hat{l}\ldots N}$, characterize the tensor $\mathbf{A}$ completely. In view of the transformation rule
(39.18), we see that the quantity $A^l$ transforms according to the rule of the component of an axial
vector of weight 1 [cf. (33.35)]. $(N-1)$-vectors are often called (axial) vector densities.

The operation given by (39.19) and (39.20) can be generalized to $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$ in general. If
$A_{i_1\ldots i_{N-r}}$ is a strict component of $\mathbf{A}\in\hat{\mathscr{T}}_{N-r}(\mathscr{V})$, then we define a quantity $A^{j_1\ldots j_r}$ by

$$A^{j_1\ldots j_r} = \sum_{i_1<\cdots<i_{N-r}}\varepsilon^{i_1\ldots i_{N-r}j_1\ldots j_r}A_{i_1\ldots i_{N-r}} = \frac{1}{(N-r)!}\,\varepsilon^{i_1\ldots i_{N-r}j_1\ldots j_r}A_{i_1\ldots i_{N-r}} \tag{39.21}$$

When $r = 1$, (39.21) reduces to (39.20). Recall that $\varepsilon^{i_1\ldots i_N}$ transforms according to the rule of an
axial contravariant tensor of order $N$ and weight 1. Moreover, since $A^{j_1\ldots j_r}$ is skew-symmetric in
$(j_1,\ldots,j_r)$, we can write the transformation rule as

$$\hat{A}^{j_1\ldots j_r} = \det[T^a_b]\sum_{i_1<\cdots<i_r}\hat{T}^{j_1\ldots j_r}_{i_1\ldots i_r}A^{i_1\ldots i_r} \tag{39.22}$$

where $\hat{T}^{j_1\ldots j_r}_{i_1\ldots i_r}$ is an $r\times r$ minor of the transformation matrix $[\hat{T}^a_b]$. When $r = 1$, (39.22) reduces to
(39.18).

As an illustration of (39.20), we take $N = 3$; then (39.20) yields

$$A^1 = A_{23},\qquad A^2 = -A_{13},\qquad A^3 = A_{12} \tag{39.23}$$

As we shall see, this operation is closely related to the classical operation of cross product (or
vector product) which assigns an axial vector to a skew-symmetric two-vector on a
three-dimensional space.
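The connection with the cross product can be seen on a small example. The sketch below is illustrative only: it assumes $N = 3$, an orthonormal basis, and the sign pattern $A^1 = A_{23}$, $A^2 = -A_{13}$, $A^3 = A_{12}$ for the quantities defined from the strict components; the function names are hypothetical. It applies that rule to $\mathbf{A} = \mathbf{u}\wedge\mathbf{v}$ and compares the result with the classical cross product formula.

```python
def two_vector(u, v):
    """Strict components A_12, A_13, A_23 of A = u ^ v (1-based labels),
    using A_ij = u_i v_j - u_j v_i."""
    return {(i, j): u[i - 1] * v[j - 1] - u[j - 1] * v[i - 1]
            for (i, j) in [(1, 2), (1, 3), (2, 3)]}

def axial_vector(A):
    """The triple (A^1, A^2, A^3) = (A_23, -A_13, A_12)."""
    return (A[(2, 3)], -A[(1, 3)], A[(1, 2)])

def cross(u, v):
    """Classical cross product in an orthonormal right-handed basis."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v = (1, 2, 3), (4, 5, 6)
print(axial_vector(two_vector(u, v)), cross(u, v))
```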


Before closing this section, we note here a convenient condition for determining whether or
not a given set of vectors is linearly dependent by examining their wedge product.

Theorem 39.3. A set of $r$ vectors $\{\mathbf{v}^1,\ldots,\mathbf{v}^r\}$ is linearly dependent if and only if their wedge
product vanishes, i.e.,

$$\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r = \mathbf{0} \tag{39.24}$$

Proof. Necessity is obvious, since if $\{\mathbf{v}^1,\ldots,\mathbf{v}^r\}$ is linearly dependent, then at least one of the
vectors can be expressed as a linear combination of the other vectors, say

$$\mathbf{v}^r = \alpha_1\mathbf{v}^1 + \cdots + \alpha_{r-1}\mathbf{v}^{r-1} \tag{39.25}$$

Then (39.24) follows because

$$\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r = \sum_{j=1}^{r-1}\alpha_j\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^{r-1}\wedge\mathbf{v}^j = \mathbf{0}$$

as required by the skew symmetry of the wedge product. Conversely, if $\{\mathbf{v}^1,\ldots,\mathbf{v}^r\}$ is linearly
independent, then it can be extended to a basis $\{\mathbf{v}^1,\ldots,\mathbf{v}^N\}$ (cf. Theorem 9.8). From Theorem 39.1,
the simple skew-symmetric tensor $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N$ forms a basis for the one-dimensional space
$\hat{\mathscr{T}}_N(\mathscr{V})$ and thus is nonzero, so that the factor $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r$ is also nonzero.

Corollary. A set of $N$ covectors $\{\mathbf{v}^1,\ldots,\mathbf{v}^N\}$ is a basis for $\mathscr{V}$ if and only if $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N \ne \mathbf{0}$.
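Theorem 39.3 gives a practical dependence test: a wedge product vanishes exactly when all of its strict components vanish, and by (39.11) those components are the $r\times r$ minors of the array of vector components. A minimal sketch under these assumptions follows; the function names are hypothetical, and integer or rational components are assumed so that the comparison with zero is exact.

```python
from itertools import combinations, permutations

def parity(sigma):
    """+1 for an even permutation, -1 for an odd one."""
    n = len(sigma)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant by the permutation expansion; adequate for small matrices."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = parity(sigma)
        for row in range(n):
            term *= M[row][sigma[row]]
        total += term
    return total

def is_linearly_dependent(vectors):
    """True when v^1 ^ ... ^ v^r = 0, i.e. when every strict component vanishes.

    For r > N there are no increasing index tuples, so the wedge product is
    zero and the set is reported as dependent.
    """
    r, N = len(vectors), len(vectors[0])
    return all(det([[v[i] for i in idx] for v in vectors]) == 0
               for idx in combinations(range(N), r))

print(is_linearly_dependent([(1, 2, 3), (2, 4, 6)]))
print(is_linearly_dependent([(1, 0, 0), (0, 1, 0)]))
```

With floating-point components an exact comparison with zero is not reliable, and a tolerance would have to be chosen.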


Exercises 
39.1 Show that the product basis of $\hat{\mathscr{T}}_r(\mathscr{V})$ satisfies the following transformation rule:


Warner J THe A. Aer (39.26) 


j<=<j, 


relative to any change of basis from fê 3 to fe i} with transformation matrix [T; | : 


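39.2 For the skew-symmetric tensor space $\hat{\mathscr{T}}_r(\mathscr{V})$ it is customary to define the inner product

$$*:\hat{\mathscr{T}}_r(\mathscr{V})\times\hat{\mathscr{T}}_r(\mathscr{V})\to\mathscr{R} \tag{39.27}$$

by requiring the product basis of an orthonormal basis be orthonormal with respect to $*$. In
other words, relative to an orthonormal basis $\{\mathbf{i}^1,\ldots,\mathbf{i}^N\}$ the inner product $\mathbf{A}*\mathbf{B}$ of any
$\mathbf{A},\mathbf{B}\in\hat{\mathscr{T}}_r(\mathscr{V})$ is given by

$$\mathbf{A}*\mathbf{B} = \sum_{i_1<\cdots<i_r}A_{i_1\ldots i_r}B_{i_1\ldots i_r} \tag{39.28}$$

Show that this inner product is related to the ordinary inner product $\cdot$ on $\hat{\mathscr{T}}_r(\mathscr{V})$, regarded
as a subspace of $\mathscr{T}_r(\mathscr{V})$, by

$$\mathbf{A}*\mathbf{B} = \frac{1}{r!}\,\mathbf{A}\cdot\mathbf{B} \tag{39.29}$$

for all $\mathbf{A},\mathbf{B}\in\hat{\mathscr{T}}_r(\mathscr{V})$.
39.3 Relative to the inner product $*$, show that

$$(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r)*(\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^r) = \det\begin{bmatrix}\mathbf{v}^1\cdot\mathbf{u}^1 & \cdots & \mathbf{v}^1\cdot\mathbf{u}^r\\ \vdots & & \vdots\\ \mathbf{v}^r\cdot\mathbf{u}^1 & \cdots & \mathbf{v}^r\cdot\mathbf{u}^r\end{bmatrix} \tag{39.30}$$

for all simple skew-symmetric tensors $\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^r$ and $\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^r$ in $\hat{\mathscr{T}}_r(\mathscr{V})$.
39.4 Relative to the product basis of any basis $\{\mathbf{e}^i\}$, show that the inner product $\mathbf{A}*\mathbf{B}$ has the
representation

$$\mathbf{A}*\mathbf{B} = \sum_{\substack{i_1<\cdots<i_r\\ j_1<\cdots<j_r}}\det\begin{bmatrix}e^{i_1j_1} & \cdots & e^{i_1j_r}\\ \vdots & & \vdots\\ e^{i_rj_1} & \cdots & e^{i_rj_r}\end{bmatrix}A_{i_1\ldots i_r}B_{j_1\ldots j_r} \tag{39.31}$$

where $e^{ij}$ is given by

$$e^{ij} = \mathbf{e}^i\cdot\mathbf{e}^j \tag{39.32}$$

as before [cf. (14.8)]. In particular, if $\{\mathbf{e}^i\}$ is orthonormal, then $e^{ij} = \delta^{ij}$ and (39.31)
reduces to (39.28).
39.5 Prove the transformation rule (39.22).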
Section 40. Determinant and Orientation


In the preceding section we have shown that the strict component of an N-vector over an
N-dimensional space $\mathscr{V}$ transforms according to the rule of an axial scalar of weight 1. For brevity,
we call an N-vector simply a density or a density tensor. From (39.15), the space $\hat{\mathscr{T}}_N(\mathscr{V})$ of
densities on $\mathscr{V}$ is one-dimensional, and for any basis $\{\mathbf{e}^i\}$ a product basis for $\hat{\mathscr{T}}_N(\mathscr{V})$ is
$\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$. If $\mathbf{D}$ is a density, then it has the representation

$$\mathbf{D} = D_{12\ldots N}\,\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N \tag{40.1}$$

where the strict component $D_{12\ldots N}$ is given by

$$D_{12\ldots N} = \mathbf{D}(\mathbf{e}_1,\ldots,\mathbf{e}_N) \tag{40.2}$$

$\{\mathbf{e}_i\}$ being the reciprocal basis of $\{\mathbf{e}^i\}$ as usual. The representation (40.1) shows that every
density is a simple skew-symmetric tensor, e.g., we can represent $\mathbf{D}$ by

$$\mathbf{D} = (D_{12\ldots N}\,\mathbf{e}^1)\wedge\mathbf{e}^2\wedge\cdots\wedge\mathbf{e}^N \tag{40.3}$$

This representation is not unique, of course.

Now if $\mathbf{A}$ is an endomorphism of $\mathscr{V}$, we can define a linear map

$$f:\hat{\mathscr{T}}_N(\mathscr{V})\to\hat{\mathscr{T}}_N(\mathscr{V}) \tag{40.4}$$

by the condition

$$f(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N) = (\mathbf{A}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{v}^N) \tag{40.5}$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^N\in\mathscr{V}$. We can prove the existence and uniqueness of the linear map $f$ by the
universal factorization property of $\hat{\mathscr{T}}_N(\mathscr{V})$ as shown by Theorem 38.1. Indeed, we define first the
skew-symmetric multilinear map

$$F:\underbrace{\mathscr{V}\times\cdots\times\mathscr{V}}_{N\text{ times}}\to\hat{\mathscr{T}}_N(\mathscr{V}) \tag{40.6}$$

by

$$F(\mathbf{v}^1,\ldots,\mathbf{v}^N) = (\mathbf{A}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{v}^N) \tag{40.7}$$

where the skew symmetry and the multilinearity of $F$ are obvious. Then from (38.10) we can
define a unique linear map $f$ of the form (40.4) such that

$$F = f\circ\wedge \tag{40.8}$$

which means precisely the condition (40.5).


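Since $\hat{\mathscr{T}}_N(\mathscr{V})$ is one-dimensional, the linear map $f$ must have the representation

$$f(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N) = \alpha\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N \tag{40.9}$$

where $\alpha$ is a scalar uniquely determined by $f$ and hence by $\mathbf{A}$. We claim that

$$\alpha = \det\mathbf{A} \tag{40.10}$$

Thus the determinant of $\mathbf{A}$ can be defined free of any basis by the condition

$$(\mathbf{A}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{v}^N) = (\det\mathbf{A})\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N \tag{40.11}$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^N\in\mathscr{V}$.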

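To prove (40.10), or, equivalently, (40.11), we choose reciprocal bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ for $\mathscr{V}$ and recall that the determinant of $\mathbf{A}$ is given by the determinant of the component matrix of $\mathbf{A}$. Let $\mathbf{A}$ be represented by the component forms

$$\mathbf{A}\mathbf{e}_i = A^j{}_i\,\mathbf{e}_j, \qquad \mathbf{A}\mathbf{e}^i = A_j{}^i\,\mathbf{e}^j \tag{40.12}$$

where (40.12)$_2$ is meaningful because $\mathscr{V}$ is an inner product space. Then, by definition, we have

$$\det\mathbf{A} = \det[A^j{}_i] = \det[A_j{}^i] \tag{40.13}$$

Substituting (40.12)$_2$ into (40.5), we get

$$f(\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N) = A_{j_1}{}^1\cdots A_{j_N}{}^N\,\mathbf{e}^{j_1}\wedge\cdots\wedge\mathbf{e}^{j_N} \tag{40.14}$$

The skew symmetry of the wedge product implies that

$$\mathbf{e}^{j_1}\wedge\cdots\wedge\mathbf{e}^{j_N} = \varepsilon^{j_1\cdots j_N}\,\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N \tag{40.15}$$

where the $\varepsilon$ symbol is defined by (20.3). Combining (40.14) and (40.15), and comparing the result with (40.9), we see that

$$\alpha\,\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N = A_{j_1}{}^1\cdots A_{j_N}{}^N\,\varepsilon^{j_1\cdots j_N}\,\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N \tag{40.16}$$

or equivalently

$$\alpha = A_{j_1}{}^1\cdots A_{j_N}{}^N\,\varepsilon^{j_1\cdots j_N} \tag{40.17}$$

since $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ is a basis for $\hat{\mathscr{T}}_N(\mathscr{V})$. But by definition the right-hand side of (40.17) is equal to the determinant of the matrix $[A_j{}^i]$ and thus (40.10) is proved.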
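
The component formula (40.17) can be checked directly on small matrices. The following is a minimal Python sketch, not part of the original text: the names `perm_sign` and `det_epsilon` are illustrative, and the entry `a[i][j]` is assumed to hold the component of $\mathbf{A}$ with upper index $i$ and lower index $j$.

```python
from itertools import permutations

def perm_sign(p):
    """Value of the epsilon symbol for the index tuple p (a permutation of 0..n-1)."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign  # each inversion flips the sign
    return sign

def det_epsilon(a):
    """Epsilon-weighted sum of products with one factor per row, as in (40.17).

    Index tuples with a repeated entry are skipped because epsilon vanishes there,
    so only permutations contribute.
    """
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(det_epsilon(identity), det_epsilon(a))
```

The sum has $N!$ terms, so this is only meant to mirror the formula, not to be an efficient way of computing determinants.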


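As an illustration of (40.11), we see that

$$\det\mathbf{I} = 1 \tag{40.18}$$

since we have

$$(\mathbf{I}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{I}\mathbf{v}^N) = \mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N = (\det\mathbf{I})\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N$$

More generally, we have

$$\det(\alpha\mathbf{I}) = \alpha^N \tag{40.19}$$

since

$$(\alpha\mathbf{I}\mathbf{v}^1)\wedge\cdots\wedge(\alpha\mathbf{I}\mathbf{v}^N) = (\alpha\mathbf{v}^1)\wedge\cdots\wedge(\alpha\mathbf{v}^N) = \alpha^N\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N$$

It is now also obvious that

$$\det(\mathbf{A}\mathbf{B}) = (\det\mathbf{A})(\det\mathbf{B}) \tag{40.20}$$

since we have

$$(\mathbf{A}\mathbf{B}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{B}\mathbf{v}^N) = (\det\mathbf{A})\,(\mathbf{B}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{B}\mathbf{v}^N) = (\det\mathbf{A})(\det\mathbf{B})\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^N\in\mathscr{V}$. Combining (40.19) and (40.20), we recover also the formula

$$\det(\alpha\mathbf{A}) = \alpha^N(\det\mathbf{A}) \tag{40.21}$$

Another application of (40.11) yields the result that $\mathbf{A}$ is non-singular if and only if $\det\mathbf{A}$ is non-zero. To see this result, notice first the simple fact that $\mathbf{A}$ is non-singular if and only if $\mathbf{A}$ transforms any basis $\{\mathbf{e}^i\}$ of $\mathscr{V}$ into a basis $\{\mathbf{A}\mathbf{e}^i\}$ of $\mathscr{V}$. From the corollary of Theorem 39.3, $\{\mathbf{A}\mathbf{e}^i\}$ is a basis of $\mathscr{V}$ if and only if $(\mathbf{A}\mathbf{e}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{e}^N)\neq\mathbf{0}$. Then from (40.11), $(\mathbf{A}\mathbf{e}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{e}^N)\neq\mathbf{0}$ if and only if $\det\mathbf{A}\neq 0$. Thus $\det\mathbf{A}\neq 0$ is necessary and sufficient for $\mathbf{A}$ to be non-singular.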
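
The rules (40.20) and (40.21), and the vanishing of the determinant for a singular map, can be spot-checked numerically. The sketch below is an illustration only: the matrices `A`, `B`, `singular` and the helper names are assumptions chosen for the example, and integer entries are used so that the comparisons are exact.

```python
from itertools import permutations

def det(a):
    """Determinant by the permutation (epsilon-symbol) expansion."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def matmul(a, b):
    """Component matrix of the composition AB."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(alpha, a):
    """Component matrix of alpha * A."""
    return [[alpha * x for x in row] for row in a]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
N, alpha = 3, 3

product_rule_holds = det(matmul(A, B)) == det(A) * det(B)          # (40.20)
scaling_rule_holds = det(scale(alpha, A)) == alpha ** N * det(A)   # (40.21)

# The second row is twice the first, so this map sends a basis to a dependent set.
singular = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]
print(product_rule_holds, scaling_rule_holds, det(singular))
```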


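The determinant, of course, is just one of the invariants of $\mathbf{A}$. In Chapter 6, Section 26, we have introduced the set of fundamental invariants $\{\mu_1,\ldots,\mu_N\}$ for $\mathbf{A}$ by the equation

$$\det(\mathbf{A}+t\mathbf{I}) = t^N + \mu_1 t^{N-1} + \cdots + \mu_{N-1}t + \mu_N \tag{40.22}$$

Since we have shown that the determinant of any endomorphism can be characterized by the condition (40.11), if we apply (40.11) to the endomorphism $\mathbf{A}+t\mathbf{I}$, the result is

$$(\mathbf{A}\mathbf{v}^1+t\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{v}^N+t\mathbf{v}^N) = (t^N + \mu_1 t^{N-1} + \cdots + \mu_{N-1}t + \mu_N)\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N$$

Comparing the coefficient of $t^{N-K}$ on the two sides of this equation, we obtain

$$\mu_K\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N = \sum_{i_1<\cdots<i_K} \mathbf{v}^1\wedge\cdots\wedge\mathbf{A}\mathbf{v}^{i_1}\wedge\cdots\wedge\mathbf{A}\mathbf{v}^{i_K}\wedge\cdots\wedge\mathbf{v}^N \tag{40.23}$$

where there are precisely $\binom{N}{K}$ terms in the summation on the right-hand side of (40.23), each containing $K$ factors of $\mathbf{A}$ acting on the covectors $\mathbf{v}^{i_1},\ldots,\mathbf{v}^{i_K}$. In particular, taking $K=N$, we recover the result

$$\mu_N\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N = (\mathbf{A}\mathbf{v}^1)\wedge\cdots\wedge(\mathbf{A}\mathbf{v}^N)$$

which is the same as (40.11) since

$$\mu_N = \det\mathbf{A}$$

Likewise, taking $K=1$, we obtain

$$\mu_1\,\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N = \sum_{i=1}^{N} \mathbf{v}^1\wedge\cdots\wedge\mathbf{A}\mathbf{v}^i\wedge\cdots\wedge\mathbf{v}^N \tag{40.24}$$

which characterizes the trace of $\mathbf{A}$ free of any basis, since

$$\mu_1 = \operatorname{tr}\mathbf{A}$$

Of course, we can prove the existence and the uniqueness of $\mu_K$ satisfying the condition (40.23) by the universal factorization property as before.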
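
In components, the coefficient comparison behind (40.23) amounts to the standard fact that $\mu_K$ is the sum of the principal $K\times K$ minors of the component matrix of $\mathbf{A}$. The following Python sketch, with illustrative helper names and an arbitrarily chosen integer matrix, computes the invariants that way and compares the polynomial (40.22) with a direct evaluation of $\det(\mathbf{A}+t\mathbf{I})$ at one value of $t$.

```python
from itertools import combinations, permutations

def det(a):
    """Determinant by the permutation (epsilon-symbol) expansion."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def invariant(a, k):
    """mu_k as the sum of the principal k-by-k minors of the component matrix a."""
    n = len(a)
    return sum(det([[a[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))

def det_shifted(a, t):
    """det(A + t I), evaluated directly."""
    n = len(a)
    return det([[a[i][j] + (t if i == j else 0) for j in range(n)]
                for i in range(n)])

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
N = len(A)
mu = [invariant(A, k) for k in range(1, N + 1)]   # [mu_1, ..., mu_N]
t = 2
polynomial = t ** N + sum(mu[k - 1] * t ** (N - k) for k in range(1, N + 1))
print(mu, polynomial == det_shifted(A, t))
```

Here `mu[0]` is the trace and `mu[-1]` is the determinant, in agreement with the cases $K=1$ and $K=N$ above.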


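Since the space $\hat{\mathscr{T}}_N(\mathscr{V})$ is one-dimensional, it is divided by $\mathbf{0}$ into two nonzero segments. Two nonzero densities $\mathbf{D}_1$ and $\mathbf{D}_2$ are in the same segment if they differ by a positive scalar factor, say $\mathbf{D}_2 = \lambda\mathbf{D}_1$, where $\lambda>0$. Conversely, if $\mathbf{D}_1$ and $\mathbf{D}_2$ differ by a negative scalar factor, then they belong to different segments. If $\{\mathbf{e}^i\}$ and $\{\hat{\mathbf{e}}^i\}$ are two bases for $\mathscr{V}$, we define their corresponding product bases $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ and $\hat{\mathbf{e}}^1\wedge\cdots\wedge\hat{\mathbf{e}}^N$ for $\hat{\mathscr{T}}_N(\mathscr{V})$ as usual; then we say that $\{\mathbf{e}^i\}$ and $\{\hat{\mathbf{e}}^i\}$ have the same orientation and that the change of basis is a proper transformation if $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ and $\hat{\mathbf{e}}^1\wedge\cdots\wedge\hat{\mathbf{e}}^N$ belong to the same segment of $\hat{\mathscr{T}}_N(\mathscr{V})$. Conversely, opposite orientation and improper transformation are defined by the condition that $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ and $\hat{\mathbf{e}}^1\wedge\cdots\wedge\hat{\mathbf{e}}^N$ belong to different segments of $\hat{\mathscr{T}}_N(\mathscr{V})$. From (39.26) for the case $r=N$, the product bases $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ and $\hat{\mathbf{e}}^1\wedge\cdots\wedge\hat{\mathbf{e}}^N$ are related by

$$\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N = \det[T^i_j]\,\hat{\mathbf{e}}^1\wedge\cdots\wedge\hat{\mathbf{e}}^N \tag{40.25}$$

Consequently, the change of basis is proper or improper if and only if $\det[T^i_j]$ is positive or negative, respectively.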
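
The criterion just stated is easy to apply in practice. As a hedged illustration, the function `classify_change_of_basis` below (a hypothetical name, not from the text) takes the transformation matrix $[T^i_j]$ of (40.25) as a list of rows and reports whether the change of basis is proper or improper.

```python
from itertools import permutations

def det(a):
    """Determinant by the permutation (epsilon-symbol) expansion."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def classify_change_of_basis(t):
    """Return 'proper' if det[T] > 0 and 'improper' if det[T] < 0, cf. (40.25)."""
    d = det(t)
    if d == 0:
        raise ValueError("not a change of basis: the matrix is singular")
    return "proper" if d > 0 else "improper"

swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]     # interchanges two basis elements
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # cyclic relabeling of the basis
print(classify_change_of_basis(swap), classify_change_of_basis(cyclic))
```

Interchanging two basis elements reverses the orientation, while a cyclic relabeling of three basis elements preserves it.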


It is conventional to designate one segment of $\hat{\mathscr{T}}_N(\mathscr{V})$ positive and the other one negative. Whenever such a designation has been made, we say that $\hat{\mathscr{T}}_N(\mathscr{V})$ and, thus, $\mathscr{V}$, are oriented. For an oriented space $\mathscr{V}$ a basis $\{\mathbf{e}^i\}$ is positively oriented or right-handed if its product basis $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ belongs to the positive segment of $\hat{\mathscr{T}}_N(\mathscr{V})$; otherwise, the basis is negatively oriented or left-handed. It is customary to restrict the choice of basis for an oriented space to positively oriented bases only. Under such a restriction, the parity $\varepsilon$ [cf. (33.35)] of all relative tensors always has the value $+1$, since the transformation is restricted to be proper. Hence, in this case it is not necessary to distinguish relative tensors into axial ones and polar ones.


As remarked in Exercise 39.2, it is conventional to use the inner product $*$ defined by (39.29) for skew-symmetric tensors. For the space of densities $\hat{\mathscr{T}}_N(\mathscr{V})$ the inner product is given by [cf. (39.31)]

$$\mathbf{A}*\mathbf{B} = \det[e^{ij}]\,A_{1\cdots N}B_{1\cdots N} \tag{40.26}$$

where $A_{1\cdots N}$ and $B_{1\cdots N}$ are the strict components of $\mathbf{A}$ and $\mathbf{B}$ as defined by (40.2). Clearly, there exists a unique unit density with respect to $*$ in each segment of $\hat{\mathscr{T}}_N(\mathscr{V})$. If $\hat{\mathscr{T}}_N(\mathscr{V})$ is oriented, the unit density in the positive segment is usually denoted by $\mathbf{E}$; then the unit density in the negative segment is $-\mathbf{E}$. In general, if $\mathbf{D}$ is any density in $\hat{\mathscr{T}}_N(\mathscr{V})$, then it has the representation

$$\mathbf{D} = d\,\mathbf{E} \tag{40.27}$$

In accordance with the usual practice, the scalar $d$ in (40.27), which is the component of $\mathbf{D}$ relative to the positive unit density $\mathbf{E}$, is also called a density or more specifically a density scalar, as opposed to the term density tensor for $\mathbf{D}$. This convention is consistent with the common practice of identifying a scalar $\alpha$ with an element $\alpha 1$ in $\mathscr{R}$, where $\mathscr{R}$ is, of course, an oriented one-dimensional space with positive unit element 1.


If $\{\mathbf{e}^i\}$ is a basis in an oriented space $\mathscr{V}$, then we define the density $e^*$ of $\{\mathbf{e}^i\}$ to be the component of the product basis $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ relative to $\mathbf{E}$, namely

$$\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N = e^*\mathbf{E} \tag{40.28}$$

In other words, $e^*$ is the density of the product basis of $\{\mathbf{e}^i\}$. Clearly, $\{\mathbf{e}^i\}$ is positively oriented or negatively oriented depending on whether its density $e^*$ is positive or negative, respectively. We say that a basis $\{\mathbf{e}^i\}$ is unimodular if its density has unit absolute value, i.e., $|e^*|=1$. All orthonormal bases, right-handed or left-handed, are always unimodular. A unimodular basis in general need not be orthonormal, however.

From (40.28) and (40.26), we have

$$(e^*)^2 = (e^*\mathbf{E})*(e^*\mathbf{E}) = \det[e^{ij}] \tag{40.29}$$

for any basis $\{\mathbf{e}^i\}$. Hence the absolute density $|e^*|$ of $\{\mathbf{e}^i\}$ is given by

$$|e^*| = (\det[e^{ij}])^{1/2} \tag{40.30}$$

Substituting (40.30) into (40.28), we have

$$\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N = \varepsilon\,(\det[e^{ij}])^{1/2}\,\mathbf{E} \tag{40.31}$$

where $\varepsilon$ is $+1$ if $\{\mathbf{e}^i\}$ is positively oriented and it is $-1$ if $\{\mathbf{e}^i\}$ is negatively oriented.
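
A small numerical illustration of (40.29) may be helpful. In the sketch below we assume, for the sake of the example, that each basis element $\mathbf{e}^i$ is given by its components relative to a right-handed orthonormal basis whose product basis is $\mathbf{E}$; under that assumption the density $e^*$ is the determinant of the component matrix. The function names and the two sample bases are illustrative only.

```python
from itertools import permutations

def det(a):
    """Determinant by the permutation (epsilon-symbol) expansion."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def gram(rows):
    """Matrix [e^{ij}] of inner products e^i . e^j, cf. (39.32)."""
    return [[sum(x * y for x, y in zip(r, s)) for s in rows] for r in rows]

def density(rows):
    """Density e* of the basis whose elements are the given rows (assumption above)."""
    return det(rows)

def is_unimodular(rows):
    return abs(density(rows)) == 1

sheared = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]   # unimodular but not orthonormal
flipped = [[0, 1, 0], [2, 0, 0], [0, 0, 1]]   # negatively oriented, |e*| = 2

for basis in (sheared, flipped):
    # (40.29): the square of the density equals det[e^{ij}]
    print(density(basis), det(gram(basis)))
```

The basis `sheared` shows that a unimodular basis need not be orthonormal, as remarked above.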


From (40.25) and (40.27) the density of a basis transforms according to the rule

$$e^* = \det[T^i_j]\,\hat{e}^* \tag{40.32}$$

under a change of basis from $\{\mathbf{e}^i\}$ to $\{\hat{\mathbf{e}}^i\}$. Comparing (40.32) with (33.35), we see that the density of a basis is an axial relative scalar of weight $-1$.


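An interesting property of a unit density ($\mathbf{E}$ or $-\mathbf{E}$) is given by the following theorem.

Theorem 40.1. If $\mathbf{U}$ is a unit density in $\hat{\mathscr{T}}_N(\mathscr{V})$, then

$$[\mathbf{U}*(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N)][\mathbf{U}*(\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^N)] = \det[\mathbf{v}^i\cdot\mathbf{u}^j] \tag{40.33}$$

for all $\mathbf{v}^1,\ldots,\mathbf{v}^N$ and $\mathbf{u}^1,\ldots,\mathbf{u}^N$ in $\mathscr{V}$.

Proof. Clearly we can represent $\mathbf{U}$ as the product basis of an orthonormal basis, say $\{\mathbf{i}_k\}$, with

$$\mathbf{U} = \mathbf{i}_1\wedge\cdots\wedge\mathbf{i}_N \tag{40.34}$$

From (40.34) and (39.30), we then have

$$\mathbf{U}*(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N) = \det[\mathbf{i}_j\cdot\mathbf{v}^i] = \det[v^i{}_j]$$

and

$$\mathbf{U}*(\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^N) = \det[\mathbf{i}_j\cdot\mathbf{u}^i] = \det[u^i{}_j]$$

where $v^i{}_j$ and $u^i{}_j$ are the $j$th components of $\mathbf{v}^i$ and $\mathbf{u}^i$ relative to $\{\mathbf{i}_k\}$, respectively. Using the product rule and the transpose rule of the determinant, we then obtain

$$\begin{aligned}
[\mathbf{U}*(\mathbf{v}^1\wedge\cdots\wedge\mathbf{v}^N)][\mathbf{U}*(\mathbf{u}^1\wedge\cdots\wedge\mathbf{u}^N)] &= \det[v^i{}_k]\det[u^j{}_k] \\
&= \det[v^i{}_k]\det[u^j{}_k]^T = \det\left([v^i{}_k][u^j{}_k]^T\right) \\
&= \det\left[\sum_{k=1}^{N} v^i{}_k u^j{}_k\right] = \det[\mathbf{v}^i\cdot\mathbf{u}^j]
\end{aligned}$$

which is the desired result.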
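
The final chain of equalities in the proof can be checked on concrete numbers. The sketch below takes two sets of three vectors, given by integer components relative to an orthonormal basis (the values are arbitrary choices for illustration), and compares the product of the two component determinants with the determinant of the matrix of inner products in (40.33).

```python
from itertools import permutations

def det(a):
    """Determinant by the permutation (epsilon-symbol) expansion."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Row i holds the components of v^i (resp. u^i) relative to the orthonormal basis.
V = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
U = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]

lhs = det(V) * det(U)                               # product of the two factors
rhs = det([[dot(v, u) for u in U] for v in V])      # det[v^i . u^j]
print(lhs, rhs)
```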


By exactly the same argument, we have also the following theorem.

Theorem 40.2. If $\mathbf{U}$ is a unit density as before, then

$$\mathbf{U}(\mathbf{v}_1,\ldots,\mathbf{v}_N)\,\mathbf{U}(\mathbf{u}_1,\ldots,\mathbf{u}_N) = \det[\mathbf{v}_i\cdot\mathbf{u}_j] \tag{40.35}$$

for all $\mathbf{v}_1,\ldots,\mathbf{v}_N$ and $\mathbf{u}_1,\ldots,\mathbf{u}_N$ in $\mathscr{V}$.


Exercises 


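40.1 If $\mathbf{A}$ is an endomorphism of $\mathscr{V}$, show that the determinant of $\mathbf{A}$ can be characterized by the following basis-free condition:

$$\mathbf{D}(\mathbf{A}\mathbf{v}_1,\ldots,\mathbf{A}\mathbf{v}_N) = (\det\mathbf{A})\,\mathbf{D}(\mathbf{v}_1,\ldots,\mathbf{v}_N) \tag{40.36}$$

for all densities $\mathbf{D}\in\hat{\mathscr{T}}_N(\mathscr{V})$ and all vectors $\mathbf{v}_1,\ldots,\mathbf{v}_N$ in $\mathscr{V}$.

Note. A complete proof of this result consists of the following two parts:
(i) There exists a unique scalar $\alpha$, depending on $\mathbf{A}$, such that

$$\mathbf{D}(\mathbf{A}\mathbf{v}_1,\ldots,\mathbf{A}\mathbf{v}_N) = \alpha\,\mathbf{D}(\mathbf{v}_1,\ldots,\mathbf{v}_N)$$

for all $\mathbf{D}\in\hat{\mathscr{T}}_N(\mathscr{V})$ and all vectors $\mathbf{v}_1,\ldots,\mathbf{v}_N$ in $\mathscr{V}$.
(ii) The scalar $\alpha$ is equal to $\det\mathbf{A}$.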
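40.2 Use the formula (40.36) and show that the fundamental invariants $\{\mu_1,\ldots,\mu_N\}$ of an endomorphism $\mathbf{A}$ can be characterized by

$$\mu_K\,\mathbf{D}(\mathbf{v}_1,\ldots,\mathbf{v}_N) = \sum_{i_1<\cdots<i_K} \mathbf{D}(\mathbf{v}_1,\ldots,\mathbf{A}\mathbf{v}_{i_1},\ldots,\mathbf{A}\mathbf{v}_{i_K},\ldots,\mathbf{v}_N) \tag{40.37}$$

for all $\mathbf{v}_1,\ldots,\mathbf{v}_N\in\mathscr{V}$ and all $\mathbf{D}\in\hat{\mathscr{T}}_N(\mathscr{V})$. Here the summation on the right-hand side of (40.37) is similar to that of (40.23), i.e., there are $\binom{N}{K}$ terms in the summation, each containing the value of $\mathbf{D}$ at the argument with $K$ vectors $\mathbf{v}_{i_1},\ldots,\mathbf{v}_{i_K}$ acted on by $\mathbf{A}$. In particular, taking $K=N$, we recover the formula (40.36), and taking $K=1$, we obtain

$$(\operatorname{tr}\mathbf{A})\,\mathbf{D}(\mathbf{v}_1,\ldots,\mathbf{v}_N) = \sum_{i=1}^{N}\mathbf{D}(\mathbf{v}_1,\ldots,\mathbf{A}\mathbf{v}_i,\ldots,\mathbf{v}_N) \tag{40.38}$$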


40.3 The Gramian is a function (not multilinear) $G$ of $K$ vectors defined by

$$G(\mathbf{v}_1,\ldots,\mathbf{v}_K) = \det[\mathbf{v}_i\cdot\mathbf{v}_j] \tag{40.39}$$

where the matrix $[\mathbf{v}_i\cdot\mathbf{v}_j]$ is $K\times K$, of course. Use the result of Theorem 40.2 and show that

$$G(\mathbf{v}_1,\ldots,\mathbf{v}_K) \geq 0 \tag{40.40}$$

and that the equality holds if and only if $\{\mathbf{v}_1,\ldots,\mathbf{v}_K\}$ is a linearly dependent set. Note. When $K=2$, the result (40.40) reduces to the Schwarz inequality.
40.4 Use the results of this section and prove that

$$\det\mathbf{A} = \det\mathbf{A}^T$$

for an endomorphism $\mathbf{A}\in\mathscr{L}(\mathscr{V};\mathscr{V})$.
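40.5 Show that $\operatorname{adj}\mathbf{A}$, which is used in (26.15), can be defined by

$$(\operatorname{adj}\mathbf{A})\mathbf{v}^1\wedge\mathbf{v}^2\wedge\cdots\wedge\mathbf{v}^N = \mathbf{v}^1\wedge\mathbf{A}\mathbf{v}^2\wedge\cdots\wedge\mathbf{A}\mathbf{v}^N$$

40.6 Use (40.23) and the result in Exercise 40.5 and show that

$$\mu_{N-1} = \operatorname{tr}\operatorname{adj}\mathbf{A}$$

40.7 Show that

$$\operatorname{adj}(\mathbf{A}\mathbf{B}) = (\operatorname{adj}\mathbf{B})(\operatorname{adj}\mathbf{A})$$

$$\det\operatorname{adj}\mathbf{A} = (\det\mathbf{A})^{N-1}$$

$$\operatorname{adj}(\operatorname{adj}\mathbf{A}) = (\det\mathbf{A})^{N-2}\mathbf{A}$$

and

$$\det\operatorname{adj}(\operatorname{adj}\mathbf{A}) = (\det\mathbf{A})^{(N-1)^2}$$

for $\mathbf{A},\mathbf{B}\in\mathscr{L}(\mathscr{V};\mathscr{V})$ and $N=\dim\mathscr{V}$.


Section 41. Duality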


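As a consequence of (39.9), the dimension of the space $\hat{\mathscr{T}}_r(\mathscr{V})$ is equal to that of $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$. Hence the spaces $\hat{\mathscr{T}}_r(\mathscr{V})$ and $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$ are isomorphic. The purpose of this section is to establish a particular isomorphism,

$$\mathbf{D}_r: \hat{\mathscr{T}}_r(\mathscr{V}) \to \hat{\mathscr{T}}_{N-r}(\mathscr{V}) \tag{41.1}$$

called the duality operator, for an oriented space $\mathscr{V}$. As we shall see, this duality operator gives rise to a definition of the operation of cross product or vector product when $N=3$ and $r=2$.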


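We recall first that $\hat{\mathscr{T}}_r(\mathscr{V})$ is equipped with the inner product $*$ given by (39.29). Let $\mathbf{E}$ be the distinguished positive unit density in $\hat{\mathscr{T}}_N(\mathscr{V})$ as introduced in the preceding section. Then for any $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$ we define its dual $\mathbf{D}_r\mathbf{A}$ in $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$ by the condition

$$\mathbf{E}*(\mathbf{A}\wedge\mathbf{Z}) = (\mathbf{D}_r\mathbf{A})*\mathbf{Z} \tag{41.2}$$

for all $\mathbf{Z}\in\hat{\mathscr{T}}_{N-r}(\mathscr{V})$. Since the left-hand side of (41.2) is linear in $\mathbf{Z}$, and since $*$ is an inner product, $\mathbf{D}_r\mathbf{A}$ is uniquely determined by (41.2). Further, from (41.2), $\mathbf{D}_r\mathbf{A}$ depends linearly on $\mathbf{A}$, so $\mathbf{D}_r$ is a linear transformation from $\hat{\mathscr{T}}_r(\mathscr{V})$ to $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$.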
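Theorem 41.1. The duality operator $\mathbf{D}_r$ defined by the condition (41.2) is an isomorphism.

Proof. Since $\hat{\mathscr{T}}_r(\mathscr{V})$ and $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$ are isomorphic, it suffices to show that $\mathbf{D}_r$ is an isometry, i.e.,

$$(\mathbf{D}_r\mathbf{A})*(\mathbf{D}_r\mathbf{A}) = \mathbf{A}*\mathbf{A} \tag{41.3}$$

for all $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$. The polar identity

$$\mathbf{A}*\mathbf{B} = \tfrac{1}{2}\{\mathbf{A}*\mathbf{A} + \mathbf{B}*\mathbf{B} - (\mathbf{A}-\mathbf{B})*(\mathbf{A}-\mathbf{B})\} \tag{41.4}$$

then implies that $\mathbf{D}_r$ preserves the inner product also; i.e.,

$$(\mathbf{D}_r\mathbf{A})*(\mathbf{D}_r\mathbf{B}) = \mathbf{A}*\mathbf{B} \tag{41.5}$$

for all $\mathbf{A}$ and $\mathbf{B}$ in $\hat{\mathscr{T}}_r(\mathscr{V})$. To prove (41.3), we choose an arbitrary right-handed orthonormal basis $\{\mathbf{i}_k\}$ of $\mathscr{V}$. Then $\mathbf{E}$ is simply the product basis of $\{\mathbf{i}_k\}$:

$$\mathbf{E} = \mathbf{i}_1\wedge\cdots\wedge\mathbf{i}_N \tag{41.6}$$

Of course, we represent $\mathbf{A}$, $\mathbf{Z}$, and $\mathbf{D}_r\mathbf{A}$ in strict component form relative to $\{\mathbf{i}_k\}$ also; then (41.2) can be written as

$$\sum_{\substack{i_1<\cdots<i_r\\ j_1<\cdots<j_{N-r}}} \varepsilon_{i_1\cdots i_r j_1\cdots j_{N-r}} A_{i_1\cdots i_r} Z_{j_1\cdots j_{N-r}} = \sum_{j_1<\cdots<j_{N-r}} (\mathbf{D}_r\mathbf{A})_{j_1\cdots j_{N-r}} Z_{j_1\cdots j_{N-r}} \tag{41.7}$$

Since $\mathbf{Z}$ is arbitrary, (41.7) implies

$$(\mathbf{D}_r\mathbf{A})_{j_1\cdots j_{N-r}} = \sum_{i_1<\cdots<i_r} \varepsilon_{i_1\cdots i_r j_1\cdots j_{N-r}} A_{i_1\cdots i_r} \tag{41.8}$$

This formula gives the strict components of $\mathbf{D}_r\mathbf{A}$ in terms of those of $\mathbf{A}$ relative to a right-handed orthonormal basis.

From (41.8) we can compute the value of the left-hand side of (41.3) by

$$\begin{aligned}
(\mathbf{D}_r\mathbf{A})*(\mathbf{D}_r\mathbf{A}) &= \sum_{j_1<\cdots<j_{N-r}}\ \sum_{\substack{i_1<\cdots<i_r\\ k_1<\cdots<k_r}} \varepsilon_{i_1\cdots i_r j_1\cdots j_{N-r}}\,\varepsilon_{k_1\cdots k_r j_1\cdots j_{N-r}} A_{i_1\cdots i_r} A_{k_1\cdots k_r} \\
&= \sum_{i_1<\cdots<i_r} A_{i_1\cdots i_r} A_{i_1\cdots i_r} = \mathbf{A}*\mathbf{A}
\end{aligned}$$

which is the desired result.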


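Having proved that $\mathbf{D}_r$ is an isomorphism from $\hat{\mathscr{T}}_r(\mathscr{V})$ to $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$ relative to the inner product $*$, we can now use the conditions (41.2) and (41.5) to compute the inverse of $\mathbf{D}_r$. Indeed, from the skew symmetry of the wedge product [cf. (38.25)] we have

$$\mathbf{A}\wedge\mathbf{Z} = (-1)^{r(N-r)}\,\mathbf{Z}\wedge\mathbf{A} \tag{41.9}$$

Hence (41.2) yields

$$(\mathbf{D}_r\mathbf{A})*\mathbf{Z} = (-1)^{r(N-r)}\,\mathbf{A}*(\mathbf{D}_{N-r}\mathbf{Z}) \tag{41.10}$$

for all $\mathbf{A}\in\hat{\mathscr{T}}_r(\mathscr{V})$ and $\mathbf{Z}\in\hat{\mathscr{T}}_{N-r}(\mathscr{V})$. On the other hand, (41.5) yields

$$(\mathbf{D}_r\mathbf{A})*\mathbf{Z} = \mathbf{A}*(\mathbf{D}_r^{-1}\mathbf{Z}) \tag{41.11}$$

when we choose $\mathbf{D}_r\mathbf{B}=\mathbf{Z}$. Comparing (41.10) with (41.11), we obtain

$$\mathbf{D}_r^{-1} = (-1)^{r(N-r)}\,\mathbf{D}_{N-r} \tag{41.12}$$

which is the desired result.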
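
Relative to a right-handed orthonormal basis, (41.8) is simple enough to implement directly, which gives a concrete check of (41.3) and (41.12). The following Python sketch is an illustration under stated assumptions: a skew-symmetric tensor is stored as a dictionary mapping increasing index tuples (1-based) to strict components, and the names `dual` and `star` are hypothetical.

```python
def perm_sign(p):
    """Value of the epsilon symbol on a tuple of distinct indices."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def dual(a, n):
    """Strict components of D_r A from those of A, following (41.8).

    For a given complementary index set only one term of the sum in (41.8)
    is nonzero, so each component of A is sent to exactly one component of
    the dual, with the sign epsilon_{i_1...i_r j_1...j_{N-r}}.
    """
    out = {}
    for idx, value in a.items():
        comp = tuple(k for k in range(1, n + 1) if k not in idx)
        out[comp] = perm_sign(idx + comp) * value
    return out

def star(a, b):
    """Inner product * in an orthonormal basis: sum of products of strict components."""
    return sum(value * b.get(idx, 0) for idx, value in a.items())

A = {(1, 2): 3, (1, 3): -1, (2, 3): 2}   # N = 3, r = 2
DA = dual(A, 3)
B = {(1,): 1, (3,): 5}                   # N = 4, r = 1
DDB = dual(dual(B, 4), 4)
print(DA, star(A, A), star(DA, DA), DDB)
```

For $N=3$, $r=2$ the sign $(-1)^{r(N-r)}$ is $+1$, so applying the dual twice returns $\mathbf{A}$; for $N=4$, $r=1$ it is $-1$, so applying it twice returns $-\mathbf{B}$.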


We notice that the equations (41.8) and (39.21) are very similar to each other, the only difference being that (39.21) applies to all bases while (41.8) is restricted to right-handed orthonormal bases only. Since the quantities given by (39.21) transform according to the rule of an axial tensor density, while the dual $\mathbf{D}_r\mathbf{A}$ is a tensor in $\hat{\mathscr{T}}_{N-r}(\mathscr{V})$, the formula (41.8) is no longer valid if the basis is not right-handed and orthonormal. If $\{\mathbf{e}^i\}$ is an arbitrary basis, then $\mathbf{E}$ is given by (40.31). In this case if we represent $\mathbf{A}$, $\mathbf{Z}$, and $\mathbf{D}_r\mathbf{A}$ again by their strict components relative to $\{\mathbf{e}^i\}$, then from (39.21) and (40.28) the condition (41.2) can be written as


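$$e^*\sum_{\substack{i_1<\cdots<i_r\\ j_1<\cdots<j_{N-r}}} \varepsilon^{i_1\cdots i_r j_1\cdots j_{N-r}} A_{i_1\cdots i_r} Z_{j_1\cdots j_{N-r}} = \sum_{\substack{k_1<\cdots<k_{N-r}\\ j_1<\cdots<j_{N-r}}} \det\begin{bmatrix} e^{k_1 j_1} & \cdots & e^{k_1 j_{N-r}}\\ \vdots & & \vdots\\ e^{k_{N-r} j_1} & \cdots & e^{k_{N-r} j_{N-r}}\end{bmatrix} (\mathbf{D}_r\mathbf{A})_{k_1\cdots k_{N-r}} Z_{j_1\cdots j_{N-r}} \tag{41.13}$$

Since $\mathbf{Z}$ is arbitrary, (41.13) implies

$$e^*\sum_{i_1<\cdots<i_r} \varepsilon^{i_1\cdots i_r j_1\cdots j_{N-r}} A_{i_1\cdots i_r} = \sum_{k_1<\cdots<k_{N-r}} \det\begin{bmatrix} e^{k_1 j_1} & \cdots & e^{k_1 j_{N-r}}\\ \vdots & & \vdots\\ e^{k_{N-r} j_1} & \cdots & e^{k_{N-r} j_{N-r}}\end{bmatrix} (\mathbf{D}_r\mathbf{A})_{k_1\cdots k_{N-r}} \tag{41.14}$$

If we multiply (41.14) by

$$\det\begin{bmatrix} e_{j_1 l_1} & \cdots & e_{j_1 l_{N-r}}\\ \vdots & & \vdots\\ e_{j_{N-r} l_1} & \cdots & e_{j_{N-r} l_{N-r}}\end{bmatrix}$$

and sum on $j_1,\ldots,j_{N-r}$ in increasing order, then we obtain

$$(\mathbf{D}_r\mathbf{A})_{l_1\cdots l_{N-r}} = e^*\sum_{\substack{i_1<\cdots<i_r\\ j_1<\cdots<j_{N-r}}} A_{i_1\cdots i_r}\,\varepsilon^{i_1\cdots i_r j_1\cdots j_{N-r}} \det\begin{bmatrix} e_{j_1 l_1} & \cdots & e_{j_1 l_{N-r}}\\ \vdots & & \vdots\\ e_{j_{N-r} l_1} & \cdots & e_{j_{N-r} l_{N-r}}\end{bmatrix} \tag{41.15}$$

which is the desired result. In deriving (41.15) we have used the identity (21.26) for the matrix $[e^{ij}]$ and its inverse $[e_{ij}]$ in order to obtain the formula

$$\sum_{j_1<\cdots<j_{N-r}} \det\begin{bmatrix} e^{k_1 j_1} & \cdots & e^{k_1 j_{N-r}}\\ \vdots & & \vdots\\ e^{k_{N-r} j_1} & \cdots & e^{k_{N-r} j_{N-r}}\end{bmatrix} \det\begin{bmatrix} e_{j_1 l_1} & \cdots & e_{j_1 l_{N-r}}\\ \vdots & & \vdots\\ e_{j_{N-r} l_1} & \cdots & e_{j_{N-r} l_{N-r}}\end{bmatrix} = \delta^{k_1\cdots k_{N-r}}_{l_1\cdots l_{N-r}} \tag{41.16}$$

Equation (41.15) follows, since we have

$$\sum_{k_1<\cdots<k_{N-r}} \delta^{k_1\cdots k_{N-r}}_{l_1\cdots l_{N-r}} (\mathbf{D}_r\mathbf{A})_{k_1\cdots k_{N-r}} = (\mathbf{D}_r\mathbf{A})_{l_1\cdots l_{N-r}} \tag{41.17}$$

Notice that (41.8) is a special case of (41.15) when the basis is right-handed and orthonormal, since in this case $e^*=1$, $e_{ij}=\delta_{ij}$ and, from (21.5),

$$\det\begin{bmatrix} \delta_{j_1 l_1} & \cdots & \delta_{j_1 l_{N-r}}\\ \vdots & & \vdots\\ \delta_{j_{N-r} l_1} & \cdots & \delta_{j_{N-r} l_{N-r}}\end{bmatrix} = \delta^{j_1\cdots j_{N-r}}_{l_1\cdots l_{N-r}}$$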
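Then as in (41.17) we have

$$\sum_{j_1<\cdots<j_{N-r}} \varepsilon^{i_1\cdots i_r j_1\cdots j_{N-r}}\,\delta^{j_1\cdots j_{N-r}}_{l_1\cdots l_{N-r}} = \varepsilon^{i_1\cdots i_r l_1\cdots l_{N-r}}$$

and thus (41.15) reduces to (41.8).

The duality operator can be used to define the cross product or the vector product for an oriented three-dimensional space. If we take $N=3$, then $\mathbf{D}_2$ is an isomorphism from $\hat{\mathscr{T}}_2(\mathscr{V})$ to $\hat{\mathscr{T}}_1(\mathscr{V})$,

$$\mathbf{D}_2: \hat{\mathscr{T}}_2(\mathscr{V}) \to \hat{\mathscr{T}}_1(\mathscr{V}) = \mathscr{V} \tag{41.18}$$

so for any $\mathbf{u}$ and $\mathbf{v}\in\mathscr{V}$, $\mathbf{D}_2(\mathbf{u}\wedge\mathbf{v})$ is a vector in $\mathscr{V}$. We put

$$\mathbf{u}\times\mathbf{v} = \mathbf{D}_2(\mathbf{u}\wedge\mathbf{v}) \tag{41.19}$$

called the cross product of $\mathbf{u}$ with $\mathbf{v}$.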


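We now prove that this definition is consistent with the classical definition of a cross product. This fact is more or less obvious. From (41.2), we have

$$\mathbf{E}*(\mathbf{u}\wedge\mathbf{v}\wedge\mathbf{w}) = (\mathbf{u}\times\mathbf{v})\cdot\mathbf{w} \quad \text{for all } \mathbf{w}\in\mathscr{V} \tag{41.20}$$

where we have replaced the $*$ inner product on the right-hand side by the $\cdot$ inner product since $\mathbf{u}\times\mathbf{v}$ and $\mathbf{w}$ belong to $\mathscr{V}$. Equation (41.20) shows that

$$(\mathbf{u}\times\mathbf{v})\cdot\mathbf{u} = (\mathbf{u}\times\mathbf{v})\cdot\mathbf{v} = 0 \tag{41.21$_1$}$$

so that $\mathbf{u}\times\mathbf{v}$ is orthogonal to $\mathbf{u}$ and $\mathbf{v}$. That equation shows also that $\mathbf{u}\times\mathbf{v}=\mathbf{0}$ if and only if $\mathbf{u},\mathbf{v}$ are linearly dependent. Further, if $\mathbf{u},\mathbf{v}$ are linearly independent, and if $\mathbf{n}$ is the unit normal of $\mathbf{u}$ and $\mathbf{v}$ such that $\{\mathbf{u},\mathbf{v},\mathbf{n}\}$ form a right-handed basis, then

$$(\mathbf{u}\times\mathbf{v})\cdot\mathbf{n} > 0 \tag{41.21$_2$}$$

which means that $\mathbf{u}\times\mathbf{v}$ is pointing in the same direction as $\mathbf{n}$. Finally, from (40.33) we obtain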


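$$\begin{aligned}
[\mathbf{E}*(\mathbf{u}\wedge\mathbf{v}\wedge\mathbf{n})][\mathbf{E}*(\mathbf{u}\wedge\mathbf{v}\wedge\mathbf{n})] &= (\mathbf{u}\times\mathbf{v})\cdot(\mathbf{u}\times\mathbf{v}) \\
&= \det\begin{bmatrix} \mathbf{u}\cdot\mathbf{u} & \mathbf{u}\cdot\mathbf{v} & 0\\ \mathbf{v}\cdot\mathbf{u} & \mathbf{v}\cdot\mathbf{v} & 0\\ 0 & 0 & 1\end{bmatrix} = \|\mathbf{u}\|^2\|\mathbf{v}\|^2 - (\mathbf{u}\cdot\mathbf{v})^2 \\
&= \|\mathbf{u}\|^2\|\mathbf{v}\|^2(1-\cos^2\theta) = \|\mathbf{u}\|^2\|\mathbf{v}\|^2\sin^2\theta
\end{aligned} \tag{41.22}$$

where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$. Hence we have shown that $\mathbf{u}\times\mathbf{v}$ is in the direction of $\mathbf{n}$ and has the magnitude $\|\mathbf{u}\|\|\mathbf{v}\|\sin\theta$, so that the definition of $\mathbf{u}\times\mathbf{v}$, based on (41.20), is consistent with the classical definition of the cross product.

From (41.19), the usual properties of the cross product are obvious; e.g., $\mathbf{u}\times\mathbf{v}$ is bilinear and skew-symmetric in $\mathbf{u}$ and $\mathbf{v}$. From (41.8), relative to a right-handed orthonormal basis $\{\mathbf{i}_1,\mathbf{i}_2,\mathbf{i}_3\}$ the components of $\mathbf{u}\times\mathbf{v}$ are given by

$$(\mathbf{u}\times\mathbf{v})_k = \sum_{i<j} \varepsilon_{ijk}(u_i v_j - u_j v_i) = \varepsilon_{ijk}u_i v_j \tag{41.23}$$

where the summations on $i,j$ in the last term are unrestricted. Thus we have

$$\begin{aligned}
(\mathbf{u}\times\mathbf{v})_1 &= u_2 v_3 - u_3 v_2 \\
(\mathbf{u}\times\mathbf{v})_2 &= u_3 v_1 - u_1 v_3 \\
(\mathbf{u}\times\mathbf{v})_3 &= u_1 v_2 - u_2 v_1
\end{aligned} \tag{41.24}$$

On the other hand, if we use an arbitrary basis $\{\mathbf{e}^i\}$, then from (41.15) we have

$$(\mathbf{u}\times\mathbf{v})_l = \sum_{i<j}(u_i v_j - u_j v_i)\,e^*\varepsilon^{ijk}e_{kl} = e^*\varepsilon^{ijk}e_{kl}\,u_i v_j \tag{41.25}$$

where $e^*$ is given by (40.31), namely

$$e^* = \varepsilon\,(\det[e^{ij}])^{1/2}$$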
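
For a right-handed orthonormal basis, the component formulas (41.23) and (41.24) and the properties (41.21)$_1$ and (41.22) can be verified on sample vectors. The sketch below is illustrative only; the function names are assumptions, indices run over 0, 1, 2 rather than 1, 2, 3, and integer components keep the comparisons exact.

```python
def epsilon(i, j, k):
    """Permutation symbol on the indices 0, 1, 2 (zero if an index repeats)."""
    return (i - j) * (j - k) * (k - i) // 2

def cross_epsilon(u, v):
    """Components from the unrestricted sum epsilon_{ijk} u_i v_j, cf. (41.23)."""
    return tuple(sum(epsilon(i, j, k) * u[i] * v[j]
                     for i in range(3) for j in range(3))
                 for k in range(3))

def cross_explicit(u, v):
    """Components written out as in (41.24)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

u, v = (1, 2, 3), (4, 5, 6)
w = cross_explicit(u, v)
print(w, cross_epsilon(u, v))
print(dot(w, u), dot(w, v))                                  # orthogonality (41.21)
print(dot(w, w), dot(u, u) * dot(v, v) - dot(u, v) ** 2)     # magnitude (41.22)
```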


Exercises 


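41.1 If $\mathbf{D}\in\hat{\mathscr{T}}_N(\mathscr{V})$, what is the value of $\mathbf{D}_N\mathbf{D}$?

41.2 If $\{\mathbf{e}^i\}$ is a basis of $\mathscr{V}$, determine the strict components of the dual $\mathbf{D}_r(\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r})$.

Hint. The strict components of $\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}$ are $\{\delta^{i_1\cdots i_r}_{j_1\cdots j_r},\ j_1<\cdots<j_r\}$ since as in (41.17) we have

$$\sum_{j_1<\cdots<j_r} \delta^{i_1\cdots i_r}_{j_1\cdots j_r}\,\mathbf{e}^{j_1}\wedge\cdots\wedge\mathbf{e}^{j_r} = \mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} \tag{41.26}$$

41.3 If $\mathbf{A}$ is an endomorphism of a three-dimensional oriented inner product space $\mathscr{V}$, show that

$$\mathbf{A}\mathbf{u}\cdot(\mathbf{A}\mathbf{v}\times\mathbf{A}\mathbf{w}) = (\det\mathbf{A})\,\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) \tag{41.27}$$

and if $\mathbf{A}$ is invertible, show that

$$\mathbf{A}\mathbf{v}\times\mathbf{A}\mathbf{w} = (\det\mathbf{A})(\mathbf{A}^{-1})^T(\mathbf{v}\times\mathbf{w}) \tag{41.28}$$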




Section 42. Transformation to the Contravariant Representation 


So far we have used the covariant representations of skew-symmetric tensors only. We could, of course, develop the results of exterior algebra using the contravariant representations or even the mixed representations of skew-symmetric tensors, since we have a fixed rule of transformation among the various representations based on the inner product, as explained in Section 35. In this section, we shall demonstrate the transformation from the covariant representation to the contravariant representation.


Recall that in general if $\mathbf{A}$ is a tensor of order $r$, then relative to any reciprocal bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ the contravariant components $A^{i_1\cdots i_r}$ of $\mathbf{A}$ are related to the covariant components $A_{j_1\cdots j_r}$ of $\mathbf{A}$ by [cf. (35.21)]

$$A^{i_1\cdots i_r} = e^{i_1 j_1}\cdots e^{i_r j_r} A_{j_1\cdots j_r} \tag{42.1}$$

This transformation rule is valid for all $r$th order tensors, including skew-symmetric ones. However, as we have explained in Section 39, for skew-symmetric tensors it is convenient to use the strict components. Their transformation rule no longer has the simple form (42.1), since the summations on the repeated indices $j_1\cdots j_r$ on the right-hand side of (42.1) are unrestricted. We shall now derive the transformation rule between the contravariant and the covariant strict components of a skew-symmetric tensor.


Recall that the strict components of a skew-symmetric tensor are simply the ordinary components restricted to an increasing set of indices, as shown in (39.10). In order to obtain an equivalent form of (42.1) using the strict components of $\mathbf{A}$ only, we must replace the right-hand side of that equation by a restricted summation. For this purpose we use the identity [cf. (37.26)]

$$A_{j_1\cdots j_r} = A_{[j_1\cdots j_r]} = \frac{1}{r!}\,\delta^{k_1\cdots k_r}_{j_1\cdots j_r} A_{k_1\cdots k_r} \tag{42.2}$$

where, by assumption, $\mathbf{A}$ is skew-symmetric. Substituting (42.2) into (42.1), we obtain

$$A^{i_1\cdots i_r} = \frac{1}{r!}\,e^{i_1 j_1}\cdots e^{i_r j_r}\,\delta^{k_1\cdots k_r}_{j_1\cdots j_r} A_{k_1\cdots k_r} \tag{42.3}$$

Now, since the coefficient of $A_{k_1\cdots k_r}$ is skew-symmetric, we can restrict the summations on $k_1\cdots k_r$ to the increasing order by removing the factor $1/r!$, that is

$$A^{i_1\cdots i_r} = \sum_{k_1<\cdots<k_r} e^{i_1 j_1}\cdots e^{i_r j_r}\,\delta^{k_1\cdots k_r}_{j_1\cdots j_r} A_{k_1\cdots k_r} \tag{42.4}$$

This is the desired transformation rule from the covariant strict components to the contravariant ones. Clearly, the inverse of (42.4) is

$$A_{i_1\cdots i_r} = \sum_{k_1<\cdots<k_r} e_{i_1 j_1}\cdots e_{i_r j_r}\,\delta^{j_1\cdots j_r}_{k_1\cdots k_r} A^{k_1\cdots k_r} \tag{42.5}$$

which is the transformation rule from the contravariant strict components to the covariant ones.


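From (21.21) we see that (42.4) is equivalent to

$$A^{i_1\cdots i_r} = \sum_{k_1<\cdots<k_r} \det\begin{bmatrix} e^{i_1 k_1} & \cdots & e^{i_1 k_r}\\ \vdots & & \vdots\\ e^{i_r k_1} & \cdots & e^{i_r k_r}\end{bmatrix} A_{k_1\cdots k_r} \tag{42.6}$$

while (42.5) is equivalent to

$$A_{i_1\cdots i_r} = \sum_{k_1<\cdots<k_r} \det\begin{bmatrix} e_{i_1 k_1} & \cdots & e_{i_1 k_r}\\ \vdots & & \vdots\\ e_{i_r k_1} & \cdots & e_{i_r k_r}\end{bmatrix} A^{k_1\cdots k_r} \tag{42.7}$$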

In particular, when $r=N$, we have

$$A^{1\cdots N} = \det[e^{ij}]\,A_{1\cdots N}, \qquad A_{1\cdots N} = \det[e_{ij}]\,A^{1\cdots N} \tag{42.8}$$


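From the transformation rules (42.6) and (42.7), or directly from the skew-symmetry of the wedge product, we see that the product bases of any reciprocal bases $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ obey the following transformation rules:

$$\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r} = \sum_{k_1<\cdots<k_r} \det\begin{bmatrix} e^{i_1 k_1} & \cdots & e^{i_1 k_r}\\ \vdots & & \vdots\\ e^{i_r k_1} & \cdots & e^{i_r k_r}\end{bmatrix} \mathbf{e}_{k_1}\wedge\cdots\wedge\mathbf{e}_{k_r} \tag{42.9}$$

and

$$\mathbf{e}_{i_1}\wedge\cdots\wedge\mathbf{e}_{i_r} = \sum_{k_1<\cdots<k_r} \det\begin{bmatrix} e_{i_1 k_1} & \cdots & e_{i_1 k_r}\\ \vdots & & \vdots\\ e_{i_r k_1} & \cdots & e_{i_r k_r}\end{bmatrix} \mathbf{e}^{k_1}\wedge\cdots\wedge\mathbf{e}^{k_r} \tag{42.10}$$

In deriving (42.9) and (42.10) we have used the fact that the strict covariant components of $\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}$ are

$$\delta^{i_1\cdots i_r}_{j_1\cdots j_r}, \qquad j_1<\cdots<j_r$$

and, likewise, the strict contravariant components of $\mathbf{e}_{i_1}\wedge\cdots\wedge\mathbf{e}_{i_r}$ are

$$\delta^{j_1\cdots j_r}_{i_1\cdots i_r}, \qquad j_1<\cdots<j_r$$

as shown by (41.26). If we apply (42.9) and (42.10) to the product bases $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ and $\mathbf{e}_1\wedge\cdots\wedge\mathbf{e}_N$, we get

$$\begin{aligned}
\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N &= \det[e^{ij}]\,\mathbf{e}_1\wedge\cdots\wedge\mathbf{e}_N \\
\mathbf{e}_1\wedge\cdots\wedge\mathbf{e}_N &= \det[e_{ij}]\,\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N
\end{aligned} \tag{42.11}$$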
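The product bases

$$\{\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r},\ i_1<\cdots<i_r\} \quad\text{and}\quad \{\mathbf{e}_{i_1}\wedge\cdots\wedge\mathbf{e}_{i_r},\ i_1<\cdots<i_r\}$$

are reciprocal bases with respect to the inner product $*$, since from (39.30) we have

$$(\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r})*(\mathbf{e}_{j_1}\wedge\cdots\wedge\mathbf{e}_{j_r}) = \det\begin{bmatrix} \delta^{i_1}_{j_1} & \cdots & \delta^{i_1}_{j_r}\\ \vdots & & \vdots\\ \delta^{i_r}_{j_1} & \cdots & \delta^{i_r}_{j_r}\end{bmatrix} = \delta^{i_1\cdots i_r}_{j_1\cdots j_r} \tag{42.12}$$

In particular, when $r=N$ we have

$$(\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N)*(\mathbf{e}_1\wedge\cdots\wedge\mathbf{e}_N) = 1 \tag{42.13}$$

From (42.12), we can compute the $*$ inner product of any two $r$th order skew-symmetric tensors:

$$\mathbf{A}*\mathbf{B} = \sum_{k_1<\cdots<k_r} A^{k_1\cdots k_r}B_{k_1\cdots k_r} = \sum_{k_1<\cdots<k_r} A_{k_1\cdots k_r}B^{k_1\cdots k_r} \tag{42.14}$$

These formulas are equivalent to the formula (39.31), which is based on the covariant strict components of $\mathbf{A}$ and $\mathbf{B}$.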


For an oriented space we have defined the density $e^*$ of a basis $\{\mathbf{e}^i\}$ to be the component of $\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N$ relative to the positive unit density $\mathbf{E}$, namely

$$\mathbf{e}^1\wedge\cdots\wedge\mathbf{e}^N = e^*\mathbf{E} \tag{42.15}$$

as shown by (40.28). Clearly we can define a similar component for the basis $\{\mathbf{e}_i\}$, namely

$$\mathbf{e}_1\wedge\cdots\wedge\mathbf{e}_N = e\,\mathbf{E} \tag{42.16}$$

Further, from (42.13) the components $e$ and $e^*$ are related by

$$e\,e^* = 1 \tag{42.17}$$

In view of this relation, we call $e$ the volume of $\{\mathbf{e}_i\}$. Then $\{\mathbf{e}_i\}$ is positively oriented or right-handed if its volume $e$ is positive; otherwise, $\{\mathbf{e}_i\}$ is negatively oriented or left-handed. As before, a unimodular basis $\{\mathbf{e}_i\}$ is defined by the condition that the absolute volume $|e|$ is equal to unity.

We can compute the absolute volume by

$$e^2 = \det[e_{ij}] \tag{42.18}$$

or equivalently

$$|e| = (\det[e_{ij}])^{1/2} \tag{42.19}$$

The proof is the same as that of (40.29) and (40.30). Substituting (42.19) into (42.16), we have also

$$\mathbf{e}_1\wedge\cdots\wedge\mathbf{e}_N = \varepsilon\,(\det[e_{ij}])^{1/2}\,\mathbf{E} \tag{42.20}$$

where $\varepsilon$ is $+1$ if $\{\mathbf{e}_i\}$ is positively oriented and it is $-1$ if $\{\mathbf{e}_i\}$ is negatively oriented.


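Using the contravariant components, we can simplify the formulas of the preceding section somewhat; e.g., from (42.6) we can rewrite (41.15) as

$$(\mathbf{D}_r\mathbf{A})^{j_1\cdots j_{N-r}} = e^*\sum_{i_1<\cdots<i_r} A_{i_1\cdots i_r}\,\varepsilon^{i_1\cdots i_r j_1\cdots j_{N-r}} \tag{42.21}$$

which is equivalent to

$$(\mathbf{D}_r\mathbf{A})_{j_1\cdots j_{N-r}} = e\sum_{i_1<\cdots<i_r} A^{i_1\cdots i_r}\,\varepsilon_{i_1\cdots i_r j_1\cdots j_{N-r}} \tag{42.22}$$

Similarly, (41.25) can be rewritten as

$$(\mathbf{u}\times\mathbf{v})^k = e^*\varepsilon^{ijk}u_i v_j \tag{42.23}$$

which is equivalent to

$$(\mathbf{u}\times\mathbf{v})_k = e\,\varepsilon_{ijk}u^i v^j \tag{42.24}$$


Exercises


42.1 Prove the formula (42.22).

42.2 Use the result of Exercise 41.2 or the transformation rules (42.21) and (42.22) and show that

$$\mathbf{D}_r(\mathbf{e}^{i_1}\wedge\cdots\wedge\mathbf{e}^{i_r}) = e^*\sum_{j_1<\cdots<j_{N-r}} \varepsilon^{i_1\cdots i_r j_1\cdots j_{N-r}}\,\mathbf{e}_{j_1}\wedge\cdots\wedge\mathbf{e}_{j_{N-r}} \tag{42.25}$$

which is equivalent to

$$\mathbf{D}_r(\mathbf{e}_{i_1}\wedge\cdots\wedge\mathbf{e}_{i_r}) = e\sum_{j_1<\cdots<j_{N-r}} \varepsilon_{i_1\cdots i_r j_1\cdots j_{N-r}}\,\mathbf{e}^{j_1}\wedge\cdots\wedge\mathbf{e}^{j_{N-r}} \tag{42.26}$$

42.3 Show that $\mathbf{E}$ has the representations

$$\mathbf{E} = \frac{1}{e}\,\varepsilon^{j_1\cdots j_N}\,\mathbf{e}_{j_1}\otimes\cdots\otimes\mathbf{e}_{j_N} = e\,\varepsilon_{j_1\cdots j_N}\,\mathbf{e}^{j_1}\otimes\cdots\otimes\mathbf{e}^{j_N} \tag{42.27}$$

where

$$\frac{1}{e}\,\varepsilon^{j_1\cdots j_N} = e\,\varepsilon_{i_1\cdots i_N}\,e^{i_1 j_1}\cdots e^{i_N j_N} \tag{42.28}$$

and

$$e = \varepsilon\,(\det[e_{ij}])^{1/2} \tag{42.29}$$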